Attorney's Docket No. 012627-019
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Patent Application of 

Annemarie POUSTKA et al. 

Application No.: Unassigned 
(Corresponds to PCT/DE99/01867) 

International Filing Date: 25 June 1999 

For: MODULARLY CONSTRUCTED RNA 
MOLECULES HAVING TWO 
SEQUENCE REGION TYPES 



Group Art Unit: Unassigned 
Examiner: Unassigned 



PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Prior to examination, please amend the above-captioned application as follows: 



IN THE CLAIMS:

Kindly amend the claims as follows: 
Claim 3, line 1, delete "or 2". 



Claim 4, line 1, change "any one of claims 1 to 3" to --claim 1--.



Claim 5, line 1, change "any one of claims 1 to 4" to --claim 1--.



Claim 6, line 1, change "any one of claims 1 to 5" to --claim 1--.

Claim 7, line 2, change "any one of claims 1 to 6" to --claim 1--.

Claim 9, line 2, delete "or the gene according to claim 8". 

Claim 14, lines 1-2, change "any one of claims 9 to 13" to --claim 9--.

Claim 16, lines 2-3, change "any one of claims 1 to 6" to --claim 1--.

Claim 18, line 2, change "any one of claims 1 to 6" to --claim 1--.

Claim 19, line 2, change "any one of claims 1 to 6" to --claim 1--.

20. (Amended) [Use of the RNA molecule according to any one of claims 1 to 
6, of the vector according to any one of claims 9 to 13, of the antibody or fragment thereof 
according to claim 16 or 17, of the antisense RNA according to claim 18 or of the ribozyme 
according to claim 19 for the production of a] A pharmaceutical preparation for preventing 
or treating diseases which are connected with a disturbed control of gene expression 
comprising using the RNA molecule according to claim 1.

21. (Amended) [Use of the RNA molecule according to any one of claims 1 to 
6, of the DNA sequence according to claim 7 or a fragment thereof, of the antibody or 
fragment thereof according to claim 16 or 17, or of the antisense RNA according to claim
18 or a fragment thereof] A method for the diagnosis of diseases which are connected with
a disturbed control of gene expression comprising using the RNA molecule according to
claim 1.

Claim 22, line 1, change "Use" to --The method-- and delete "20 or".

Claim 23, line 1, change "whose" to --comprising a-- and after "gene" insert
--which--.

Claim 25, line 1, delete "or 24". 

26. (Amended) A process for the production of a non-human mammal according 
to [any one of claims 23 to 25] claim 23, [characterized by] comprising the following steps:

(a) [preparation of] preparing a DNA fragment, [in particular a vector,] 
containing a modified NINTROX gene, the NINTROX gene having been 
modified by deletion of a homologous sequence and/or insertion of a 
heterologous sequence[, in particular a selectable marker];

(b) [preparation of] preparing embryonal stem cells from a non-human mammal 
[(preferably mouse)] ; 

(c) [transformation of] transforming the embryonal stem cells from step (b) with 
the DNA fragment from step (a), the NINTROX gene in the embryonal stem
cells being modified by homologous recombination with the DNA fragment
from (a), 

(d) culturing the cells from step (c), 

(e) [selection of] selecting the cultured cells from step (d) for the absence of the 
homologous sequence and/or the presence of the heterologous sequence, [in 
particular the selectable marker,] 

(f) [production of] producing chimeric non-human mammals from the cells from 
step (e) by injection of these cells in mammalian blastocysts [(preferably 
mouse blastocysts)] , [transfer of] transferring the blastocysts into false- 
pregnant female mammals [(preferably mouse)] and [analysis of] analyzing 
the resulting offspring for a change of the NINTROX gene. 

REMARKS 

Entry of the foregoing amendments is respectfully requested.
Should the Examiner have any questions concerning the subject application, a
telephone call to the undersigned would be appreciated. 

Respectfully submitted, 

Burns, Doane, Swecker & Mathis, L.L.P. 



By: [signature]

Teresa Stanek Rea

Registration No. 30,427



P.O. Box 1404 

Alexandria, Virginia 22313-1404 
(703) 836-6620 

Date: December 22, 2000

Modularly Constructed RNA Molecules Having Two Sequence
Region Types



The present invention relates to RNA molecules which are 
characterized by two sequence region types, namely a first 
sequence region type which contributes to maintaining the 
three-dimensional structure of the RNA molecule, and a 
second sequence region type which is responsible for the 
specific binding of a ligand. These RNA molecules are 
preferably useful for the direct control of gene expression. 
The present invention also provides the DNA sequence derived 
for the RNA molecules according to the invention and vectors 
which contain them. In addition, the invention relates to 
drugs or medicaments and diagnostic compositions which 
contain the above RNA molecules or vectors, to an antibody 
specifically recognizing these RNA molecules or to antisense 
RNA specifically binding to these RNA molecules or ribozymes 
cleaving these RNA molecules. Furthermore, the invention 
relates to non-human transgenic mammals and cells obtained 
therefrom. 

Gene expression in eukaryotes is usually regulated by
proteins which bind specifically to certain regulatory
sequences upstream of the gene to be expressed and exert a
characteristic effect (RNA polymerases, transcription
factors, hormone-activated receptors, etc.). Only a few
examples of gene expression being controlled directly by RNA
molecules have been known thus far. They include the RNA
"XIST" ("X chromosome inactivation specific transcript"),
which is responsible for the inactivation of the entire X
chromosome, an RNA referred to as IPW ("imprinted in
Prader-Willi syndrome") and the RNA H19, which acts as a
tumor suppressor and is involved in the control of certain
developmental processes. Artificial control of gene
expression has so far been achieved by using antisense RNAs
which bind specifically to mRNAs or by using catalytically
active RNA molecules, so-called ribozymes, which not only
bind specifically to the target RNA but also cleave and
thereby inactivate it. However, the possible applications of
these antisense RNAs or ribozymes are limited, above all as
regards the ligand to be bound and inactivated: in
principle, this ligand can only be an RNA.

Thus, there is a need for providing compounds which can 
universally detect, and/or inactivate, the most differing 
target molecules, e.g. DNA, RNA, proteins or low-molecular 
substances, and are suitable e.g. for controlling gene 
expression and thus, of course, also for preventing and 
treating diseases which are accompanied by a disturbed gene 
expression. 

Hence the technical problem of the invention is 
substantially to provide those compounds which are useful 
inter alia for the prevention or therapy (and also 
diagnosis) of such diseases. 

The solution to this technical problem was achieved by 
providing the embodiments characterized in the claims. 

The inventors could identify an RNA molecule which comprises 
the above described desired properties. This RNA molecule is 
encoded by the gene "NINTROX" (No INTROns X-chromosome)
which has no introns, is localized on the X-chromosome and 
codes for no protein. This RNA molecule is part of certain
(relatively long) transcripts of the MeCP2 gene. The MeCP2
gene (methyl-CpG binding protein 2) in Xq28 has a transcript
of about 1.8 kb which codes for the MeCP2 protein. The above
described RNA is part of relatively long MeCP2 transcripts
which also code for the MeCP2 protein but have a different
3'-non-translated region. This 3'-non-translated region is
decisive for the MeCP2 gene and its function. As used below,
the expression "NINTROX" is synonymous with the above
relatively long transcripts of the MeCP2 gene.

The genomic sequence of the human NINTROX gene is shown in 
figure 1, and the genomic sequence of the murine NINTROX 
gene is illustrated in figure 2. In figure 3, a sequence 
comparison was carried out between human and murine 
sequences. It is obvious therefrom that there are some 
highly sequence-conserved regions which according to an 
energy analysis carried out by means of a computer 
distinguish themselves by a high degree of energy (cf.
figure 4).

While the mechanism of action of the above discussed genes 
effective on the RNA level was fully unclear, the principle 
of action of such a gene which is described in more detail 
below could, for the first time, be determined by the 
analysis of the NINTROX gene. The NINTROX gene contributes 
essentially to the maintenance of the functions of the CNS, 
in particular the hippocampus. Defects in this gene result 
in limited CNS functions which reach as far as mental 
retardations. Furthermore, the NINTROX gene has an important 
function in the control of cell proliferation. In this 
connection, changes in this gene can lead to errors in the 
control of cell growth, e.g. to cancer. Changes in this gene 
may result in an increased or reduced DNA methylation. An 
increased DNA methylation can inter alia restrict or prevent
the activity of growth-controlling genes (tumor suppressor
genes) and thus result in a generally increased cancer rate.
Reduced DNA methylation can lead inter alia to an
overexpression of genes and thus to a disturbed development
of the cell or the whole organism. Further investigations
led to the result that the expression pattern of the NINTROX 
gene is effected in tissue-specific and development-specific 
manner. The Northern analyses showed an expression in all 
investigated fetal and adult tissues. No sequence homologies 
with already known sequences could be detected. 

The strategy which led to the identification of this nucleic 
acid molecule is described below. Based on the systematic 
analysis of the q28 region of the human X chromosome various 
expressed sequences could be detected and isolated. By means 
of these expressed sequences some formerly unknown genes 
could be identified and characterized according to standard 
methods, inter alia the NINTROX gene on which the present 
invention is based. 

It is of interest that the NINTROX-RNA molecules according 
to the invention have a modular structure, i.e. they are 
characterized by the presence of two different sequence 
region types. While one sequence region serves to maintain
the three-dimensional structure and, as follows from a
comparison of the sequences from various species (human,
hamster, kangaroo, macaque or macaca, orangutan, chimpanzee
and rat; cf. figure 5), is conserved only in a qualified
sense, the second sequence region which is responsible for
the specific binding to the target molecule is
sequence-conserved. Because of this modular construction of
the NINTROX-RNA it is possible to modify it such that its
effect is not limited to the above described control of
gene expression but can be used for a number of other
purposes. In addition to the control of the gene
expression it is also possible to modify the structure (e.g. 
chromatin structure, nuclear scaffold) of chromosomal 
regions by means of such modular RNA molecules. This offers 
the formerly unknown possibility of being able to influence 
the expression of relatively large genomic regions in well- 
calculated fashion. Thus, certain sequence regions of both 
modules of the NINTROX gene can be replaced by other 
sequences or even artificial sequences, so that (a) the 
interaction of this RNA with other binding partners (RNA, 
DNA, other macromolecules and low-molecular compounds) or 
their biochemical reaction (e.g. increase or decrease of the 
conversion rate) are changed in well-calculated fashion, and 
therefore the RNA molecule can be adapted in well-calculated 
fashion to novel tasks, and/or (b) the three-dimensional 
structure of the NINTROX-RNA can be adapted in well- 
calculated fashion to special demands. As a result, a 
partially or fully new function of the NINTROX-RNA molecule 
according to the invention can be obtained. 

Thus, an embodiment of the present invention relates to an 
RNA molecule which may bind to a ligand and comprises the 
following sequence regions: (a) a sequence region 
maintaining the three-dimensional structure of the RNA 
molecule, and (b) a sequence region for the specific binding 
of the ligand. 

The expression "a sequence region maintaining the three- 
dimensional structure of the RNA molecule" used herein has 
the following meaning. Three-dimensional RNA structures are 
rendered possible by base pairing of various bases within 
the RNA molecule. In this case, structures such as "stems" 
or "loops" are formed. Many of these structures yield in 
this way the overall structure of the RNA molecule. A
sequence change within the RNA molecule may remain without
consequences for the spatial structure if the sequence
change does not change the base pairings or if the sequence
change is compensated by a second sequence change. For
example, if the base pairing A-U (A-T at the DNA level) is
destroyed in that the A mutates into G, this mutation can be
compensated by another mutation of U (T) into C. Although
this changes the sequence,
the spatial structure remains the same. As a result, the 
same RNA structure can be formed by an extremely large 
number of differing RNA sequences. References to certain RNA 
structures follow from an analysis of the energy included 
therein. This analysis can be carried out by means of 
commercially available computer programs (e.g. "FOLD"; 
Michael Zuker and P. Stiegler: Optimal Computer Folding of 
Large RNA Sequences Using Thermodynamics and Auxiliary 
Information, Nucleic Acids Research 9(1) (1981), page 133).
The lower the energy content of a certain sequence, the more 
stable the three-dimensional RNA structures. The analysis of 
the NINTROX gene showed a conserved distribution of these 
low-energy structures (figure 4). The base sequence of these
RNA regions differs widely between species, but the
energy content is highly conserved. In figure 3, these are the
sequence regions which are not characterized by a black bar 
at the margin. This means that the sequence region 
maintaining the three-dimensional structure of the RNA 
molecule is not sequence-conserved but energy-conserved. For 
example, modifications of this sequence region do not orient 
themselves by the base sequence but by the conservation of 
the detected energy content. 
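
The compensating-mutation argument above can be illustrated with a minimal sketch. It assumes a hypothetical three-base stem whose two strands are written so that position i of one strand pairs with position i of the other, and it counts only Watson-Crick pairs (A-U, G-C). The function names and the example sequences are illustrative assumptions and are not taken from the application.

```python
# Watson-Crick pairs in RNA; wobble pairs such as G-U are deliberately ignored
# in this simplified illustration.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_intact(strand_5p: str, strand_3p: str) -> bool:
    """Return True if every aligned position forms a Watson-Crick pair.

    strand_3p is written 3'->5', so index i pairs with index i of strand_5p.
    """
    if len(strand_5p) != len(strand_3p):
        raise ValueError("both strands of the stem must have the same length")
    return all((a, b) in WATSON_CRICK for a, b in zip(strand_5p, strand_3p))


def mutate(seq: str, pos: int, base: str) -> str:
    """Return a copy of seq with the base at index pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Hypothetical stem: G-C, A-U, C-G
original_5p = "GAC"
original_3p = "CUG"

# A single mutation A -> G destroys the middle pair ...
mutated_5p = mutate(original_5p, 1, "G")
# ... and a second mutation U -> C on the opposite strand restores it.
compensated_3p = mutate(original_3p, 1, "C")

print(stem_intact(original_5p, original_3p))    # stem before any change
print(stem_intact(mutated_5p, original_3p))     # after the first mutation
print(stem_intact(mutated_5p, compensated_3p))  # after the compensating mutation
```

The sequence of the stem differs before and after the two mutations, while the pairing pattern, and hence the structure in this simplified model, is the same.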

The expression "a sequence region for the specific binding 
of the ligand" used herein relates to a sequence region 
which is such that it can bind specifically the desired 
ligand. These sequence regions are highly
sequence-conserved. In figure 3, these regions are marked by a black
bar at the margin and have a high energy content (cf. figure 
4). This tallies with the observation that these sequence 
regions are not "packed" but oriented outwardly and are 
responsible for the binding of the ligand, enzymatic 
reactions or the binding to other RNA or DNA sequences. If 
the ligand to be bound is an RNA molecule or a DNA molecule, 
this sequence region will be complementary to a 
corresponding, sufficiently long segment of the RNA molecule 
or DNA molecule. If the ligand to be bound is a protein, the
sequence region (b) may be partially or fully exchanged, or
supplemented, by a DNA sequence which is known to bind
the desired protein specifically.

The two above-described sequence types occur several times 
within the NINTROX-RNA. The exchange or the change of 
individual ones of such modules enables the well-calculated 
change of the NINTROX-RNA. In a modification of the module 
maintaining the three-dimensional structure attention has to 
be paid to the energy content, so that it maintains a 
minimum value. The modification of the other sequence region 
is only subject to minor restrictions even though it is 
deemed to be sequence-conserved. This region may be omitted 
fully or partially or may contain insertions. For example, 
it is also possible to insert sequences into the NINTROX-RNA 
molecule which have known biochemical properties or bind 
certain DNA molecules, RNA molecules or proteins. In 
addition, random sequences of differing length may be 
introduced into various sites of the NINTROX gene and 
thereafter selection for specific properties such as 
biochemical reaction, specific binding, etc., may be carried
out.



In a preferred embodiment of the RNA molecule according to 
the invention the sequence region (a) comprises the sequence 
regions not marked at the margin in figure 3 or sequences 
related thereto which also permit the maintenance of the 
three-dimensional structure of the RNA molecule and differ 
from sequence region (a) in figure 3. These differences 
relate to the addition, deletion and/or insertion of bases, 
at least 80 %, preferably 85 %, and more preferably at least 
90 %, of the energy content determined for the sequence of 
figure 3 being maintained. The original three-dimensional
structure is preferably maintained when these changes are 
introduced. 
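
As a minimal sketch of the percentage criterion above: the application does not state how the energy content is quantified, so the code below assumes that a folding program yields one number per sequence region, with the same sign and units for the reference and the modified sequence. The function names and the numerical values are hypothetical.

```python
def energy_fraction_retained(reference_energy: float, modified_energy: float) -> float:
    """Fraction of the reference energy content retained by the modified sequence.

    Assumes both values come from the same folding calculation and carry the
    same sign, so that their ratio is meaningful.
    """
    if reference_energy == 0:
        raise ValueError("reference energy must be non-zero")
    return modified_energy / reference_energy


def meets_threshold(reference_energy: float, modified_energy: float,
                    minimum_fraction: float = 0.80) -> bool:
    """Check the 'at least 80 %' criterion (0.85 or 0.90 for the preferred ranges)."""
    return energy_fraction_retained(reference_energy, modified_energy) >= minimum_fraction


# Hypothetical values for one structure-maintaining region
reference = -50.0   # energy determined for the sequence of figure 3 (assumed value)
modified = -43.0    # energy determined after a modification (assumed value)

print(meets_threshold(reference, modified, 0.80))
print(meets_threshold(reference, modified, 0.85))
print(meets_threshold(reference, modified, 0.90))
```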

In a particularly preferred embodiment, the sequence region 
(b) of the RNA molecule according to the invention comprises 
the sequences which are illustrated in figure 3 and marked 
with black bars at the margin. 

In another preferred embodiment of the RNA molecule 
according to the invention, the ligand to be bound is a DNA 
molecule or a protein or enzyme, e.g. DNA polymerase I. The 
RNA molecule according to the invention preferably contains 
a poly(A) sequence at the 3' end, which may contribute to
the stability in a desired host cell. 

In another preferred embodiment, the RNA molecule according 
to the invention is used to control the gene expression. For 
this purpose, the sequence region (b) is modified such that 
it binds a protein responsible for gene expression or binds 
to a certain DNA region of the target gene so as to impede 
or prevent e.g. the attachment of proteins which exert an 
influence inhibiting or supporting gene expression or also 
binds directly to the mRNA of the target gene so as to 
impede or prevent the translation, for example. The person
skilled in the art can readily modify the RNA molecule
according to the invention by corresponding modification of 
sequence region (b) and possibly also of sequence region (a) 
such that it binds the desired ligand and therefore controls 
the gene expression to the desired extent. 

The present invention also relates to a DNA sequence coding 
for the RNA molecule according to the invention and to a 
gene comprising the following features: It contains a 
promoter which permits the transcription in a desired host 
cell and a DNA sequence functionally linked therewith and 
encoding the RNA molecule according to the invention. The 
gene preferably contains additionally a termination signal 
and a polyadenylation site. 

In a preferred embodiment the gene according to the 
invention comprises the sequence shown in figure 1 or 2. 

The DNA sequences or genes, coding for the RNA molecule 
according to the invention, may also be inserted in a 
vector. Thus, the present invention also comprises vectors 
containing these DNA sequences or genes. The term "vector" 
relates to a plasmid (e.g. pUC18, pBR322, pBlueScript), to a
virus or another suitable vehicle. In a preferred 
embodiment, the sequence coding for the DNA molecule 
according to the invention is functionally linked in the 
vector with regulatory elements which permit its expression 
in prokaryotic or eukaryotic host cells. In addition to the 
regulatory elements, e.g. a promoter, such vectors typically 
contain a replication origin and specific genes which permit 
the phenotypic selection of a transformed host cell. The 
regulatory elements for the expression in prokaryotes, e.g. 
E. coli, comprise the lac, trp or T7 promoter, and
those for the expression in eukaryotes comprise the AOX1 or
GAL1 promoter in yeast and those for the expression in
animal cells comprise the CMV, SV40, RVS-40 promoter, CMV or
SV40 enhancer. Further examples of suitable promoters are
the metallothionein I and the polyhedrin promoters. Suitable 
vectors are e.g. expression vectors, based on T7, for the 
expression in bacteria (Rosenberg et al., Gene 56 (1987),
125), pMSXND for the expression in mammalian cells (Lee and
Nathans, J. Biol. Chem. 263 (1988), 3521) and vectors 
derived from baculovirus for the expression in insect cells. 

In a preferred embodiment, the vector containing the 
sequences coding for the RNA molecules according to the 
invention is a viral vector, e.g. a vaccinia virus or 
adenovirus, which is of use for a gene therapy. RNA viruses, 
above all retroviruses, are particularly preferred. Examples 
of suitable retroviruses are MoMuLV, HaMuSV, MuMTV, RSV or 
GaLV. For the purpose of gene therapy the RNA molecules 
according to the invention can be transported to the target 
cells in the form of colloidal dispersions as well. They 
comprise e.g. liposomes or lipoplexes (Mannino et al.,
Biotechniques 6 (1988), 682). 

General methods known in the art can be used for 
constructing expression vectors which contain the sequences 
coding for the RNA molecules according to the invention and 
suitable control sequences. These methods comprise e.g. in 
vitro recombination techniques, synthetic methods and in 
vivo recombination methods, as described in Sambrook et al., 
for example. 

The present invention also relates to host cells containing 
the above described vectors. These host cells comprise 
bacteria, yeast, insect and animal cells, preferably 
mammalian cells. Preferred mammalian cells are CHO, VERO,
BHK, HeLa, COS, MDCK, 293 and WI38 cells. Methods of
transforming these host cells, of phenotypically selecting
transformants and expressing the nucleic acid molecules
according to the invention using the above described vectors 
are known in the art. 

The present invention also relates to antibodies which 
detect specifically the RNA molecule according to the 
invention. The antibodies may be monoclonal, polyclonal or 
synthetic antibodies or fragments thereof, e.g. Fab, Fv or 
scFv fragments. In this case, a monoclonal antibody is 
preferably concerned. The antibodies according to the 
invention may be produced according to standard methods, the 
RNA molecule according to the invention or a fragment 
thereof serving as an immunogen. Monoclonal antibodies may 
be produced e.g. by the method described by Kohler and 
Milstein (Nature 256 (1975), 495) and Galfre (Meth. Enzymol. 
73 (1981), 3), mouse myeloma cells being fused with 
immunized mammalian spleen cells. These antibodies may be 
used e.g. to inhibit the activity of the RNA molecules 
according to the invention, e.g. to influence the gene 
expression. The antibodies may also be used in diagnostic 
assays, for example, so as to prove whether dysregulation of 
the gene expression is accompanied e.g. by a loss or lack of 
responsible NINTROX-RNA. The antibodies may be present in 
immunoassays in liquid phase or be bound to a solid carrier. 
In this connection, the antibodies may be labeled in various 
ways. Suitable markers and labeling methods are known in the 
art. Examples of immunoassays are ELISA and RIA. 

The invention also relates to antisense RNAs which bind 
specifically to an RNA molecule according to the invention 
and may be used in vitro or in vivo to reduce the expression 
of genes controlled directly by RNA, e.g. NINTROX-RNA. The
administration of the antisense RNA according to the
invention to a target cell results in reduced gene
expression and is particularly useful for treating diseases
which are characterized by excessive expression of the
directly RNA-controlled gene (e.g. cancer diseases). In this
connection, the antisense RNAs can be administered directly
or as a DNA encoding the same, preferably inserted in a
suitable vector. Suitable vectors comprise all of the
vectors already described above in connection with the RNA
molecules according to the invention.

The antisense RNAs according to the invention comprise an 
antisense sequence having at least 7 to 10 or more 
nucleotides which hybridize specifically with a sequence of 
the RNA molecule according to the invention, e.g. NINTROX- 
RNA. The antisense RNA according to the invention preferably 
has a length of about 10 to about 50 nucleotides or of about 
14 to about 35 nucleotides. In further embodiments, the 
antisense RNAs according to the invention are RNAs shorter 
than about 100 nucleotides or shorter than about 200 
nucleotides. In general, the antisense RNAs should be long 
enough to form a stable double helix but short enough 
(depending on the kind of supply) to be administered in 
vivo, if desired. In general, the antisense sequence is 
substantially complementary to the target sequence to ensure 
specific hybridization. In certain embodiments the antisense 
sequence is directly complementary to the target sequence. 
However, the antisense RNAs may also contain nucleotide
substitutions, additions, deletions, transitions,
transpositions or modifications as long as the specific binding
to the relevant target sequence is maintained as a
functional property of the antisense RNA. The antisense RNAs
may also contain further sequences in addition to the
antisense sequences. The antisense RNAs (and the RNA
molecules according to the invention) can be produced using 
any method suitable for the production of nucleic acids, 
e.g. by chemical synthesis de novo or by cloning. An 
antisense RNA may also be produced e.g. by inserting in a 
vector (e.g. a plasmid) a sequence of the target RNA or a 
fragment thereof in reverse orientation functionally linked 
with a promoter. Provided that the promoter and preferably 
termination and polyadenylation signals are positioned 
correctly, the strand of the inserted sequence is 
transcribed which corresponds to the non-coding strand 
acting as an antisense RNA. 
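
The relation between a target segment and its antisense sequence described above can be sketched as follows. This is a simplified illustration under stated assumptions: sequences are plain strings over A, C, G and U, only Watson-Crick complementarity is considered, and the target segment shown is a made-up example, not a NINTROX sequence. The function names are hypothetical.

```python
# Translation table mapping each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_of(target_rna: str) -> str:
    """Return the perfectly complementary antisense RNA, written 5'->3'."""
    target = target_rna.upper()
    unexpected = set(target) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


def complementarity(antisense: str, target_segment: str) -> float:
    """Fraction of positions at which the antisense pairs with the target segment."""
    perfect = antisense_of(target_segment)
    if len(antisense) != len(perfect):
        raise ValueError("antisense and target segment must have the same length")
    matches = sum(a == b for a, b in zip(antisense.upper(), perfect))
    return matches / len(perfect)


# Made-up 12-nucleotide target segment
target_segment = "AUGGCUAACGUU"
perfect_antisense = antisense_of(target_segment)

# An antisense RNA with one substitution is still substantially complementary.
substituted = perfect_antisense[:-1] + "A"

print(perfect_antisense)
print(complementarity(perfect_antisense, target_segment))
print(complementarity(substituted, target_segment))
```

Whether a given degree of complementarity is sufficient for specific hybridization depends on the length and the conditions; the sketch only quantifies the match.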

The present invention also relates to ribozymes which cleave 
specifically the RNA molecules according to the invention 
and thus are also of use for inhibiting the gene expression. 
Useful ribozymes may comprise 5'-terminal and 3'-terminal
sequences which are complementary to the target RNA, and
they can be constructed by a person skilled in the art
according to standard methods (see e.g. PCT publication WO
83/23572). The ribozymes according to the invention comprise
e.g. ribozymes having the features of group I intron
ribozymes (Cech, Biotechnology 13 (1995), 323) and
"hammerhead" ribozymes (Edgington, Biotechnology 10 (1992),
256).

In one embodiment, the ribozymes according to the invention 
per se are used as drugs. In another embodiment, gene 
therapy methods are employed for the expression of ribozymes 
in a target cell ex vivo or in vivo. The methods of 
administering the ribozymes or of expressing the ribozymes 
in vivo correspond to the methods described above in 
connection with the RNA molecules according to the 
invention.

The isolation and characterization of the human NINTROX gene
and in particular the mouse homolog of the NINTROX gene
make it possible to establish an animal model for developing
therapies and drugs for the above discussed diseases.
Providing the sequence of the NINTROX gene enables both 
diagnosis (post-natally or pre-natally) and therapy of 
diseases in which the gene expression is characterized by 
the lack of NINTROX-RNA or an excess of NINTROX-RNA. 
However, the therapeutic or diagnostic application is not
limited to diseases which are accompanied by a
dysregulation of the expression of a gene controlled by
NINTROX-RNA; the RNA molecules modified in accordance
with the above described possibilities also offer the chance
of using completely new therapeutic agents.

Therefore, the present invention also relates to drugs which 
contain the above described RNA molecules, vectors,
antibodies, antisense RNAs or ribozymes. These drugs
optionally contain additionally a pharmaceutically 
acceptable carrier. The person skilled in the art is 
familiar with suitable carriers and the formulation of such 
drugs. Suitable carriers include e.g. phosphate-buffered
saline solutions, water, emulsions, e.g. oil-in-water
emulsions, wetting agents, sterile solutions, etc. The drugs
can be administered orally or parenterally. The methods of
parenteral administration include topical, intra-arterial
(e.g. directly to the tumor), intramuscular, subcutaneous,
intramedullary, intrathecal, intraventricular, intravenous,
intraperitoneal or intranasal administration. A suitable
dose is determined by the attending physician and depends on
various factors, e.g. on the patient's age, sex and weight,
the stage of a tumor, the kind of administration, etc.

The drug according to the invention is used preferably for
preventing or treating diseases which are correlated with a 
disturbed control of gene expression. The drug according to 
the invention is used particularly preferably for treating 
tumoral diseases or diseases of the CNS. In this connection, 
the drug may be used in gene therapy, the above described 
methods or vectors being usable for introducing the nucleic 
acids according to the invention. On the other hand, the RNA 
molecule according to the invention may be administered 
directly so as to restore normal expression of the gene in 
cells which no longer have functional copies of the RNA
molecule.

The present invention also relates to a diagnostic
composition which contains the RNA molecule according to the
invention, the DNA sequence coding for it or a fragment
thereof, the antibody according to the invention or a
fragment thereof, or the antisense RNA according to the
invention or a fragment thereof, or combinations thereof,
optionally together with a suitable analytical reagent. By
means of this diagnostic composition the detection may be 
made as to whether the RNA directly controlling the gene 
expression, e.g. NINTROX-RNA, is present or, as compared to 
a control, is available in excessively high or low 
concentration or with a deviating length. In this 
connection, the antibody or a fragment thereof is preferably 
used in the above described assays or the antisense RNA or a 
fragment thereof as a probe in hybridization experiments. 
For this purpose, the probe preferably has a length of at 
least 10, more preferably at least 15, bases. Suitable 
detection methods based on hybridization are known to the 
person skilled in the art. Suitable labels for the probe
are also known to the person skilled in the art and they
comprise e.g. labeling using radioisotopes, bioluminescence,
chemiluminescence, fluorescence markers, metal chelates,
enzymes, etc. This process may use methods known to the
person skilled in the art as regards the preparation of
total RNA or poly(A)+ RNA from biological samples, the
separation of the RNAs on gels separating according to size, 
e.g. denaturing agarose gels, the production and labeling of 
the probe and the detection of the hybrids, e.g. via 
"Northern blot". In this connection, diseases are preferably 
diagnosed as described above in connection with the drugs 
according to the invention. 
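
The relationship between a target RNA, an antisense probe and the preferred probe lengths stated above (at least 10, more preferably at least 15, bases) can be illustrated by a minimal sketch. The function names and the example sequence are hypothetical and serve only as an illustration; they are not taken from the NINTROX sequence.

```python
# Illustrative sketch only; function names and sequences are hypothetical.
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def antisense_rna(target_rna: str) -> str:
    """Return the reverse complement of an RNA target, written 5'->3'."""
    target = target_rna.upper().replace("T", "U")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(target))


def probe_length_class(probe: str) -> str:
    """Classify a probe by the length preferences given in the text."""
    n = len(probe)
    if n >= 15:
        return "more preferred"   # at least 15 bases
    if n >= 10:
        return "preferred"        # at least 10 bases
    return "too short"


if __name__ == "__main__":
    target = "AUGGCUAGCUAGGCAUCG"  # arbitrary example target (18 bases)
    probe = antisense_rna(target)
    print(probe, probe_length_class(probe))
```

The sketch only checks length and base complementarity; hybridization conditions and labeling are left to the methods known to the person skilled in the art.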

A diagnosis can also be made on a DNA level. In this
connection, the intactness of the gene which codes for the
RNA which is directly involved in the regulation of gene
expression, e.g. NINTROX-RNA, is investigated using the
above described nucleic acid molecules (e.g. as regards
availability, length or mutations). For this process it is
possible to use methods with which the person skilled in the
art is familiar as to the preparation of DNA from biological
samples, the restriction digestion of the DNA, the
separation of the restriction fragments on gels separating
according to size, e.g. agarose gels, the production and
labeling of the probe and the detection of hybridization,
e.g. via "Southern blot". The above detection can also be
carried out via PCR. In this connection, primers are used
which flank the coding sequence. Here, amplification
products of DNA from the tissue in question which differ,
e.g. as regards length or sequence, from the amplification
products of DNA from healthy tissue are of diagnostic
significance.
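
The PCR-based comparison described above can be sketched in silico as follows. This is a simplified illustration under stated assumptions: exact primer matches on a single strand, and toy templates and primers that are invented for the example. The helper names `amplicon` and `compare_amplicons` are hypothetical.

```python
# Illustrative sketch; sequences and function names are hypothetical.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def amplicon(template: str, fwd_primer: str, rev_primer: str):
    """Return the region bounded by both primers (inclusive), or None.

    The forward primer is searched on the given strand; the reverse
    primer binds the opposite strand, so its reverse complement is
    searched downstream of the forward primer.
    """
    template = template.upper()
    fwd = fwd_primer.upper()
    start = template.find(fwd)
    if start < 0:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd))
    if end < 0:
        return None
    return template[start:end + len(rev_site)]


def compare_amplicons(sample_amplicon: str, healthy_amplicon: str) -> dict:
    """Report the length and sequence differences named in the text."""
    return {
        "length_difference": len(sample_amplicon) - len(healthy_amplicon),
        "sequence_differs": sample_amplicon != healthy_amplicon,
    }


if __name__ == "__main__":
    healthy = "TTCCCGGGAAAATTTTGGATCATT"   # toy template from healthy tissue
    sample = "TTCCCGGGAAAAGGATCATT"        # toy template with a 4-base deletion
    fwd, rev = "CCCGGG", "TGATCC"          # toy primers flanking the region
    print(compare_amplicons(amplicon(sample, fwd, rev),
                            amplicon(healthy, fwd, rev)))
```

A non-zero length difference or a differing sequence corresponds to the amplification products described above as being of diagnostic significance.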

The subject matter of the present invention also relates to
a non-human mammal whose NINTROX gene is modified, e.g. by
insertion of a heterologous sequence, in particular a
selection marker sequence.

The expression "non-human mammal" comprises any mammal whose 
NINTROX gene may be modified. Examples of such mammals are 
mouse, rat, rabbit, horse, cow, sheep, goat, monkey, pig, 
dog and cat, with mouse being preferred. 

The expression "NINTROX gene which is modified" signifies
that a deletion of about 1 to 2 kb is carried out by
standard methods in the NINTROX gene naturally occurring in
a non-human mammal. If desired, a heterologous sequence,
e.g. a construct for mediating antibiotic resistance (e.g. a
"neo cassette"), can be inserted in this deletion. This
method is generally described in Schwartzberg et al., Proc.
Natl. Acad. Sci. USA, Vol. 87, pp. 3210-3214, 1990, to which
reference is made.
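
At the sequence level, the modification described above (a deletion, optionally filled with a heterologous cassette) can be sketched as a simple string operation. The function name `modify_locus`, the coordinates and the placeholder cassette are assumptions made for illustration only; the placeholder is not the sequence of an actual neo cassette.

```python
# Illustrative sketch; names, coordinates and sequences are hypothetical.
def modify_locus(gene: str, del_start: int, del_length: int,
                 cassette: str = "") -> str:
    """Delete del_length bases starting at del_start (0-based) and
    insert the cassette, if any, at the site of the deletion."""
    if del_start < 0 or del_length < 0 or del_start + del_length > len(gene):
        raise ValueError("deletion lies outside the gene sequence")
    return gene[:del_start] + cassette + gene[del_start + del_length:]


if __name__ == "__main__":
    gene = "A" * 500 + "C" * 1500 + "G" * 500   # toy 2.5 kb locus
    placeholder_cassette = "N" * 1200           # stands in for a marker cassette
    modified = modify_locus(gene, 500, 1500, placeholder_cassette)
    print(len(gene), len(modified))
```

In the toy example a 1.5 kb stretch, within the range of about 1 to 2 kb named above, is replaced by the placeholder.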

A further subject matter of the present invention relates to 
cells which are obtained from the above non-human mammal. 
These cells may be present in any form, e.g. in a primary or 
long-term culture. 

A non-human mammal according to the invention can be 
provided by common methods. A method is favorable which 
comprises the steps of: 

(a) preparation of a DNA fragment, in particular a vector,
containing a modified NINTROX gene, the NINTROX gene
having been modified by deletion of a homologous
sequence and/or insertion of a heterologous sequence,
in particular a selectable marker;

(b) preparation of embryonal stem cells from a non-human
mammal (preferably mouse);

(c) transformation of the embryonal stem cells of step (b)
with the DNA fragment from step (a), the NINTROX gene
in the embryonal stem cells being modified by
homologous recombination with the DNA fragment from
(a);

(d) culturing the cells from step (c);

(e) selection of the cultured cells from step (d) for the
absence of the homologous sequence and/or the presence
of the heterologous sequence, in particular the
selectable marker; and

(f) production of chimeric non-human mammals from the cells
from step (e) by injection of these cells into mammalian
blastocysts (preferably mouse blastocysts), transfer of
the blastocysts into pseudo-pregnant female mammals
(preferably mouse) and analysis of the resulting
offspring for a modification of the NINTROX gene.
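
The selection criterion of step (e) can be written down as a small predicate, shown here as a sketch on sequence strings. In practice the selection is carried out experimentally (e.g. via the marker); the function name `passes_selection` and the toy sequences are hypothetical.

```python
# Illustrative sketch; names and sequences are hypothetical.
from typing import Optional


def passes_selection(clone_dna: str,
                     deleted_seq: Optional[str] = None,
                     marker_seq: Optional[str] = None) -> bool:
    """Apply the 'and/or' criterion of step (e).

    deleted_seq: homologous sequence that must be absent (if given).
    marker_seq:  heterologous sequence that must be present (if given).
    """
    if deleted_seq is None and marker_seq is None:
        raise ValueError("at least one criterion is required")
    ok = True
    if deleted_seq is not None:
        ok = ok and deleted_seq not in clone_dna
    if marker_seq is not None:
        ok = ok and marker_seq in clone_dna
    return ok


if __name__ == "__main__":
    wild_type = "AAACCCGGGTTT"
    targeted = "AAANNNNGGGTTT"   # 'CCC' replaced by placeholder marker 'NNNN'
    print(passes_selection(wild_type, "CCC", "NNNN"),
          passes_selection(targeted, "CCC", "NNNN"))
```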

The mechanism of homologous recombination (cf. R.M.
Torres, R. Kuhn, Laboratory Protocols for Conditional Gene
Targeting, Oxford University Press, 1997) is used in step
(c), in which the embryonal stem cells are transfected.
Homologous recombination between the DNA sequences present
in a chromosome and newly added cloned DNA sequences enables
the insertion of a cloned gene in the genome of a living
cell in place of the original gene. By this method it is
possible, when embryonal germ cells are used, to obtain via
chimeras animals which are homozygous for the desired gene,
the desired gene portion or the desired mutation.

The expression "embryonal stem cells" comprises any
embryonal stem cells of a non-human mammal which are
suitable for the mutation of the NINTROX gene. Embryonal
mouse stem cells, in particular cells E14/1 or 129/SV, are
preferred.

The term "vector" comprises any vector which by
recombination with the DNA of embryonal stem cells enables a
modification of the NINTROX gene. The vector preferably has
a marker with which it is possible to select for stem cells
in which the desired recombination has taken place. Such a
marker is e.g. the loxP/tkneo cassette, which can be removed
from the genome again by means of the Cre/loxP system.

In addition, the person skilled in the art knows conditions
and materials to carry out steps (a) to (f).

The present invention provides a non-human mammal whose
NINTROX gene is modified. This modification can be an
elimination of the gene expression-regulatory function. By
means of such a mammal or cells therefrom, the gene
expression-controlling function of NINTROX can be
investigated selectively. Furthermore, it is possible to
find substances, drugs and therapy approaches by which a
selective influence can be exerted on the controlling
function of NINTROX. Therefore, the present invention
furnishes a basis for influencing a wide variety of
diseases. Such diseases are e.g. impairments of CNS function
extending as far as mental retardation, or the induction of
cancer resulting from errors in the control of cell
proliferation. Furthermore, it should be possible to
investigate in more detail and characterize the part played
by the hippocampus.
The following clones were deposited with DSMZ, Deutsche
Sammlung von Mikroorganismen und Zellkulturen GmbH [German
Collection of Microorganisms and Cell Cultures],
Mascheroder Weg 1b, D-38124 Braunschweig, on May 4, 1998:

DSM 12153: E. coli JFC-484, partial sequence of the human NINTROX-cDNA
DSM 12154: E. coli JFC-622, partial sequence of the murine NINTROX-cDNA
DSM 12155: E. coli JFC-8D3, sequence of the human genomic NINTROX-DNA
DSM 12156: E. coli JFC-P1-165, sequence of the murine genomic NINTROX-DNA

The figures show: 

Figure 1: human sequence of the NINTROX gene 
Figure 2: murine sequence of the NINTROX gene 

Figure 3: sequence comparison between human (top) and murine 
(bottom) sequences 

Solid bars: sequence-conserved regions (b) 

Figure 4: energy diagram of the sequences from figure 3 

Figure 5: homology comparison of NINTROX from various 
species 

Figure 5a: partial sequence from hamster 

Figure 5b: partial sequence from kangaroo 

Figure 5c: partial sequence from macaca 

Figure 5d: partial sequence from orangutan 

Figure 5e: partial sequence from rat 
Figure 5f: partial sequence from chimpanzee 

The following example explains the invention: 

Example 1: Identification and Characterization of the NINTROX Gene

For the identification of transcribed sequences from the
region Xq27.3 to Xqter, total RNA was initially isolated
from various pig tissues (kidney, heart, spleen, liver,
brain, etc.) and transcribed by means of oligo-dT into
first-strand cDNA. These complex cDNA samples, which
represent all of the genes transcribed in the respective
tissue, were then labeled radioactively and hybridized with
the Xq27.3-Xqter-specific cosmid library. In this
connection, the cosmid library was analyzed in the form of
cosmid clones arranged systematically on nylon membranes.
Then, the cosmid DNA was isolated from the cosmid clones
which had given positive hybridization signals with the
complex cDNA samples, digested using EcoRI, separated by gel
electrophoresis and transferred to nylon membranes. The
restriction fragments which then hybridized positively with
the complex, radioactively labeled cDNA samples were
subsequently isolated, labeled radioactively and used for
screening a fetal human cDNA library. In this way, positive
cDNA clones could be isolated which represented the
transcript of the NINTROX gene.
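
The EcoRI digestion step of the example can be illustrated by a minimal in-silico sketch. EcoRI recognizes GAATTC and cuts after the first G; the sketch treats the DNA as a linear molecule and considers the top strand only. The function name `ecori_fragments` and the toy sequence are hypothetical and are not taken from the cosmids used in the example.

```python
# Illustrative sketch; function name and sequence are hypothetical.
from typing import List

ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts between G and AATTC on the top strand


def ecori_fragments(dna: str) -> List[str]:
    """Return the top-strand fragments of a linear DNA after EcoRI digestion."""
    dna = dna.upper()
    cuts = []
    pos = dna.find(ECORI_SITE)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = dna.find(ECORI_SITE, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy_insert = "AAGAATTCCCGAATTCTT"
    fragments = ecori_fragments(toy_insert)
    # Fragment lengths correspond to the band pattern after gel separation.
    print([len(f) for f in fragments])
```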
Claims 

1. An RNA molecule which can bind to a ligand and 
comprises the following sequence regions: 

(a) a sequence region maintaining the three- 
dimensional structure of the RNA molecule; and 

(b) a sequence region for the specific binding of the 
ligand. 

2. The RNA molecule according to claim 1, wherein sequence 
region (a) comprises the DNA sequence shown in fig. 3 
without bars at the margin or a sequence which is 
related thereto and also permits the maintenance of the 
three-dimensional structure of the RNA molecule. 

3. The RNA molecule according to claim 1 or 2, wherein 
sequence region (b) comprises the DNA sequence shown in 
fig. 3 with bars at the margin. 

4. The RNA molecule according to any one of claims 1 to 3, 
wherein the ligand is a DNA molecule or a protein. 

5. The RNA molecule according to any one of claims 1 to 4,
which additionally contains a poly(A) sequence at the
3' end.

6. The RNA molecule according to any one of claims 1 to 5 
for the control of gene expression. 

7. The DNA sequence which codes for an RNA molecule 
according to any one of claims 1 to 6. 

8. A gene which comprises the sequence shown in fig. 1 or 
2. 
9. A vector which comprises the DNA sequence according to 
claim 7 or the gene according to claim 8. 

10. The vector according to claim 9, wherein the vector is 
a plasmid. 

11. The vector according to claim 10, wherein the vector is 
a viral vector. 

12. The vector according to claim 11, which is an RNA
virus.

13. The vector according to claim 12, which is a
retrovirus.

14. The host cell, containing the vector according to any
one of claims 9 to 13.

15. The host cell according to claim 14, wherein the host 
cell is a mammalian cell. 

16. An antibody or a fragment thereof, which binds
specifically to an RNA molecule according to any one of
claims 1 to 6.

17. The antibody according to claim 16, wherein the 
antibody is a monoclonal antibody. 

18. An antisense RNA which binds specifically to an RNA 
molecule according to any one of claims 1 to 6. 

19. A ribozyme which cleaves specifically an RNA molecule 
according to any one of claims 1 to 6. 
20. Use of the RNA molecule according to any one of claims 
1 to 6, of the vector according to any one of claims 9 
to 13, of the antibody or fragment thereof according to 
claim 16 or 17, of the antisense RNA according to claim 
18 or of the ribozyme according to claim 19 for the 
production of a pharmaceutical preparation for 
preventing or treating diseases which are connected 
with a disturbed control of gene expression. 

21. Use of the RNA molecule according to any one of claims 
1 to 6, of the DNA sequence according to claim 7 or a 
fragment thereof, of the antibody or fragment thereof 
according to claim 16 or 17, or of the antisense RNA 
according to claim 18 or a fragment thereof for the 
diagnosis of diseases which are connected with a 
disturbed control of gene expression. 

22. Use according to claim 20 or 21, wherein the disease is 
a tumoral disease or a disease of the central nervous 
system. 

23. A non-human mammal whose NINTROX gene is modified by 
deletion of a homologous sequence and/or insertion of a 
heterologous sequence. 

24. The non-human mammal according to claim 23, wherein the 
heterologous sequence is a selection marker sequence. 

25. The non-human mammal according to claim 23 or 24, 
wherein the selection marker sequence conveys 
resistance to neomycin. 
26. A process for the production of a non-human mammal
according to any one of claims 23 to 25, characterized
by the following steps:

(a) preparation of a DNA fragment, in particular a
vector, containing a modified NINTROX gene, the
NINTROX gene having been modified by deletion of a
homologous sequence and/or insertion of a
heterologous sequence, in particular a selectable
marker;

(b) preparation of embryonal stem cells from a
non-human mammal (preferably mouse);

(c) transformation of the embryonal stem cells from
step (b) with the DNA fragment from step (a), the
NINTROX gene in the embryonal stem cells being
modified by homologous recombination with the DNA
fragment from (a);

(d) culturing the cells from step (c);

(e) selection of the cultured cells from step (d) for
the absence of the homologous sequence and/or the
presence of the heterologous sequence, in
particular the selectable marker; and

(f) production of chimeric non-human mammals from the
cells from step (e) by injection of these cells into
mammalian blastocysts (preferably mouse
blastocysts), transfer of the blastocysts into
pseudo-pregnant female mammals (preferably mouse)
and analysis of the resulting offspring for a
change of the NINTROX gene.



Abstract of the Disclosure 



The invention relates to modularly constructed RNA molecules 
which can bind to a ligand and which are characterized by 
two sequence regions, namely a first sequence region which 
contributes to the maintenance of the three-dimensional 
structure of the RNA molecule, and a second sequence region 
which is responsible for the specific binding of the ligand. 
These RNA molecules, e.g. the NINTROX RNA, can be used for 
directly influencing the gene expression. The invention also 
relates to vectors containing the RNA molecules according to 
the invention as well as to medicaments and diagnostic 
compositions which contain said RNA molecules or vectors, to 
an antibody which specifically recognizes these RNA 
molecules or antisense RNA binding specifically to these RNA 
molecules, or to ribozymes cleaving these RNA molecules. In 
addition, the invention relates to non-human mammals whose 
NINTROX gene is modified by inserting a heterologous 
sequence and to cells obtained therefrom. 
Human sequence of the non-coding RNA gene (including the 
putative promoter) 

1 CTTAGAGTTT CGTGGCTTCA GGGTGGGAGT AGTTGGAGCA TTGGGGATGT 

51 TTTTCTTACC GACAAGCACA GTCAGGTTGA AGACCTAACC AGGGCCAGAA 

101 GTAGCTTTGC ACTTTTCTAA ACTAGGCTCC TTCAACAAGG CTTGCTGCAG 

151 ATACTACTGA CCAGACAAGC TGTTGACCAG GCACCTCCCC TCCCGCCCAA 

201 ACCTTTCCCC CATGTGGTCG TTAGAGACAG AGCGACAGAG CAGTTGAGAG 

251 GACACTCCCG TTTTCGGTGC CATCAGTGCC CCG?C?kCAG CTCCCCCAGC 

3 01 TCCCCCCACC TCCCCCACTC CCAACCACGT TGGGACAGGG AGGTGTGAGG 

3 51 CAGGAGAGAC AGTTGGATTC TTTAGAGAAG ATGGATATGA CCAGTGGCTA 

401 TGGCCTGTGC GATCCCACCC GTGGTGGCTC AAGTCTGGCC CCACACCAGC 

451 CCCAATCCAA AACTGGCAAG GACGCTTCAC A3GACAGGAA AGTGGCACCT 

501 GTCTGCTCCA GCTCTGGCAT GGCTAGGAGG GGGGXG^CCC TTGAACTACT 

701 AGAGCACAGC GGGGTGAGAG GGATTCCTAA TCACTCAGAG CAGTCTGTGA 
751 CTTAGTGGAC AGGGGAGGGG GCAAAGGGGG AGGAGAAGAA AATGTTCTTC 

851 TTGAGTCTTC ATGTCCCCAC TTCAAAACAA ACAGATGCTC TGAGAGCAAA 

901 CTGGCTTGAA TTGGTGACAT TTAGTCCCTC AAGCCACCAG ATGTGACAGT 

951 GTTGAGAACT ACCTGGATTT GTATATATAC CTGCGC7TGT TTTAAAGTGG 

1001 GCTCAGCACA TAGGGTTCCC ACGAAGCTCC GAAACTCTAA GTGTTTGCTG 

1051 CAATTTTATA AGGACTTCCT GATTGGTTTC TCTTCTCCCC TTCCATTTCT 

1101 GCCTTTTGTT CATTTCATCC TTTCACTTCT TTCCCTTCCT CCGTCCTCCT 

1151 CCTTCCTAGT TCA.TCCCTTC TCTTCCAGGC AGCCGCGGTG CCCAACCACA 

1201 CTTGTCGGCT CCAGTCCCCA GAACTCTGCC TGCCCTTTGT CCTCCTGCTG 

1251 CCAGTACCAG CCCCACCCTG TTTTGAGCCC TGAGGAGGCC TTGGGCTCTG 

1301 CTGAGTCCAA CCTGGCCTGT CTGTGAAGAG CAAGAGAGCA GCAAGGTCTT 

1351 GCTCTCCTAG GTA.GCCCCCT CTTCCCTGGT AAGAAAAAGC AAAAGGCATT 

1401 TCCCACCCTG AACAACGAGC CTTTTCACCC TTCTACTCTA GAGAAGTGGA 

1451 CTGGAGGAGC TGGGCCCGAT TTGGTAGTTG AGGAAAGCAC AGAGGCCTCC 

1501 TGTGGCCTGC CAGTCATCGA GTGGCCCAAC AGGGGCTCCA TGCCAGCCGA 

1551 CCTTGACCTC ACTCAGAAGT CCAGAGTCTA GCGTAGTGCA GCAGGGCAGT 

1601 AGCGGTACCA ATGCAGAACT CCCAAGACCC GAGCTGGGAC CAGTACCTGG 

1651 GTCCCCAGCC CTTCCTCTGC TCCCCCTTTT CCCTCGGAGT TCTTCTTGAA 

Fig. 1 
1701 TGGCAATGTT TTGCTTTTGC TCGATGCAGA CAGGGGGCCA GAACACCACA 
1751 CA.TTTCACTG TCTGTCTGGT CCA.TAGCTGT GGTGTAGGGG CTTAGAGGCA 
1801 TGGGCTTGCT GTGGGTTTTT AATTGATCAG TTTTCA.TGTG GGATCCCA.TC 
1851 TTTTTAACCT C TG TTC AGGA. AGTCCTTATC TAGCTGCATA TCTTCATCAT 

1901 ATTGGTATAT CCTTTTCTGT GTTTA.CAGAG ATGTCTCTTA TA.TCTAAATC 

1951 TGTCCAACTG AGAAGTACCT TATCAAAGTA GCAAATGAGA CAGCAGTCTT 

2001 ATGCTTCCAG AAACACCCAC AGGCATGTCC CATGTGAGCT GCTGCCATGA 

2051 ACTGTCAAGT GTGTGTTGTC TTGTGTATTT CAGTTATTGT CCCTGGCTTC 

2101 CTTA-CTA-TCG TGTAATCATG AAGGAGTGAA. ACATCA.TAGA. AACTGTCTAG 

2151 CACTTCCTTG CCAGTCTTTA GTGATCAC-GA A.CCATAGTTG ACAGTTCCAA 

22 51 TGAGGGAGTT TGCCCCGTTC TGTTTGTAGA GTCTCATAGT TGGA.CTTTCT 

22 01 AGCA-TATA.TG TGTCCATT7C CTTATGCTGT AAAAGCAAGT CCTGCAACCA 

2451 GGTCAC-AAGA C-AGGGTGAGT CCTCCAGAAC TCTTCCTCCA AGGACAGAAG 

2501 GCTCCTGCCC CCA.TAGTGGC CTCGAACTCC TGGCACTACC AAAGGACAC? 

2 551 TATCCA.CGAG AGCGCAGCAT CCGACCAGGT TGTCACTGAG AAGA.TGTTTA 

2 502 TTTTGGTCAG TTGGGTTTTT A.TGTATTA.TA CTT AG TC AAA TGTAA.TGTGG 

2701 GGTCCTGGTA AGAGGAGTGC GTGGCCCACC AGGCCCCCCT GTCACCCATG 

2751 ACAGTTCATT CA.GGGCCGAT GGGGCAGTCG TGGTTGGGAA CACAGCATTT 

2 301 CAAGCGTCAC TTTATTTCAT TCGGGCCCCA CCTGCAGCTC CCTCAAAGAG 

2S51 GCAGTTGCCC AGCCTCTTTC CCTTCCAGTT TATTCCAGAG CTGCCAGTGG 

2S51 TCCCTCGTCT TTCCCAAAGG CATCACGAGT CAGTCGCCTT TCAGCAGGCA 

30 CI GCCTTGGCGG TTTA.TCGCCC TGGCAGGCAG GGGCCCTGCA GCTCTCATGC 

30 51 TGCCCCTGCC TTGGGGTCAG GTTGACAGGA GGTTGGAGGG AAAGCCTTAA 

3101 GCTGCAGGAT TCTCACCAGC TGTGTCCGGC CCAGTTTTGG GGTCTGACCT 

3151 CAATTTCAAT TTTGTCTGTA C TTGAAC ATT A.TGAA.GA.TGG GGGCCTCTTT 

32 01 CAGTGAATTT GTGAACAGCA GAATTGACCG ACAGCTTTCC AGTACCCATG 
3251 GGGCTAGGTC ATTAAGGCCA CATCCACAGT CTCCCCCACC CTTGTTCCAG 

33 01 TTGTTAGTTA CTACCTCCTC TCCTGACAAT ACTGTATGTC GTCGAGCTCC 
33 51 CCCCAGGTCT ACCCCTCCCG GCCCTGCCTG CTGGTGGGCT TGTCATAGCC 
3401 AGTGGGATTG CCGGTCTTGA CAGCTCAGTG AGCTGGAGAT AC TTGGTCAC 

Fig. 1 (cont'd 1) 





3451 AGCCAGGCGC TAGCACAGCT CCCTTCTGTT GATGCTGTAT TCCCATATCA 

3501 AAAGGCACAG GGGACACCCA GAAACGCCAC ATCCCCCAAT CCATCAGTGC 

3551 CAAACTAGCC AACGGCCCCA GCTTCTCAGC TCGCTGGATG GCGGAAGCTG 

3601 CTACTCGTGA GCGCCAGTGC. GGGTGCAGAC AATCTTCTGT TGGGTGGCAT 

3651 CATTCCAGGC CCGAAGCATG AACAGTGCAC CTGGGACAGG GAGCAGCCCC 

37 01 AAATTGTCAC CTGCTTCTC? GCCCAGCTT? TCATTGCTGT GACAGTGATG 
3751 GCGAAAGAGG GTAATAACCA GACACAAACT GCCAAGTTGG GTGGAGAAAG 

38 01 GAGTTTCTTT AGCTGACAGA ATCTCTGAAT TTTAAATCAC TTAGTAAGCG 
3 3 51 GCTCAAGCCC AGGAGGGAGC AGAGGGATAC GAGCGGAGTC CCCTGCGCGG 
3 9 01 GACCATCTGG AATTGGTTTA GCCCAAGTGG AGCCTGACAG CCAGAACTC? 

4001 CTCTGGGCTG ACTGGGCCAG GGGAGGTTAC AGGTACCAGT TCTTTAAGAA 

4051 G?.TCTTTGGG CATATACATT TTTAGCCTGT GTCATTGCCC CAAATGGATT 

4101 CCTGTTTCAA GTTCACACCT GCAGATTCTA GGACCTGTGT CCTAGACTTC 

4151 AGGGAGTCAG CTGTTTCTAG AGTTCCTACC ATGGAGTGGG TCTGGAGGAC 

4251 CTCTCTGCTC TGACGGGATT TGTTGATTC? CTCCATTTTG GTGTCTTTCT 

43 51 AATCGTTAGG ATACTGCCTC CCCCAGGGTC TAAAATTACA TAT7AGAGGG 

4401 GAAAAGCTGA ACACTGAAGT CAGTTCTCAA CAATTTAGAA GGAAAACCTA 

45 01 CAAGCTTTTA CAACAGTGCT GATCTAAAAA TACTTAGCAC TTGGC CTGAG 

4551 ATGCCTGGTG AGCATTACAG GCAAGGGGAA TCTGGAGGTA GCCGACCTGA 

4501 GGACATGGCT TCTGAACCTG TCTTTTGGGA GTGGTA.TGGA AGGTGGAGCG 

4551 TTCACCAGTG AC CTGGAAGG CCCAGCACCA CCCTCCTTCC CACTCTTCTC 

47 01 ATCTTGACAG AGCCTGCCCC AGCGCTGACG TGTCAGGAAA ACACCCAGGG 

47 51 AACTAGGAAG GCACTTCTGC CTGAGGGGCA GCCT3CCTTG CCCACTCCTG 

4801 CTCTGCTCGC CTCGGATCAG CTGAGCCTTC TGAGCTGGCC TCTCACTGCC 

4851 TCCCCAAGGC CCCCTGCCTG CCCTGTCAGG AGGCAGAAGG AAGCAGGTGT 

49 01 GAGGGCAGTG CAAGGAGGGA GCACAACCCC CAGCTCCCGC TCCGGGCTCC 

4951 GACTTGTGCA CAGGCAGAGC CCAGACCCTG GAGGAAATCC TACCTTTGAA 

5001 TTCAAGAACA TTTGGGGAAT TTGGAAATCT CTTTGCCCCC AAACCCCCAT 

5051 TCTGTCCTAC C TTTAATC AG GTCCTGCTCA GCAGTGAGAG CAGATGAGGT 

5101 GAAAAGGCCA AGAGGTTTGG CTCCTGCCCA CTGATAGCCC CTCTCCCCGC 

5151 AGTGTTTGTG TGTCAAGTGG CAAAGCTGTT CTTCCTGGTG ACCCTGATTA 

5201 TATCCAGTAA CACATAGACT GTGCGCATAG GCCTGCTTTG TCTCCTCTAT 

Fig. 1 (cont'd 2) 
5251 CCTGGGCTTT TGTTTTGCTT TTTAGTTTTG CTTTTAGTTT TTCTGTCCCT 

53 01 TTTATTTAAC GCACCGACTA GACACACAAA GCAGTTGAAT TTTTATATAT 

53 51 ATATCTGTAT ATTGCACAAT TATAAACTCA TTTTGCTTGT GGCTCCACAC 

54 01 ACACAAAAAA AGACCTGTTA AAATTATACC TGTTGCTTAA TTACAATATT 
5451 TCTGATAACC ATAGCATAGG ACAAGGGAAA ATAJLAAAAAG AJLAAAAAAGA 
5 5 01 AJULAAAAACG ACAAA.TCTGT CTGCTGGTCA CTTCTTCTGT CCAAGCAGAT 

55 51 TCGTGGTCTT TTCCTCGCTT CTTTCAAGGG CTTTCCTGTG CCAGGTGAAG 
5501 GAGGCTCCAG GCAGCACCCA GGTTTTGCAC TCTTGTTTCT CCCGTGCTTG 
5 651 TGAAAGAGGT CCCAAGGTTC TGGGTGCAGG AGCGCTCCCT TGACCTGCTG 
5701 AAGTCCGGAA CGTAGTCGGC ACAGCCTGG? CGCCTTCCAC CTCTGGGAGC 
5751 TGGAGTCCAC TGGGGTGGCC TGACTCCCCC AGTCCCCTTC CCGTGACCTG 
5 301 GTCAGGGTGA GCCCATGTGG AGTCAGCCTC GCAGGCCTCC CTGCCAGTAG 
5851 GGTCCGAGTG TGTTTCA.TCC TTCCCACTC? GTCGAGCCTG GGGGCTGGAG 

5551 GAACGCCAGG GACCCCAGAA TCATGTC-CGT CAGTCCAAGG GGTCCCCTCC 

6001 AGGAGTAGTG AAGACTCCAG AAATGTCCCT ^C^CjCCC CCATCCTAC3 

6101 GTTTAGCTGT AACAGTTCTT TTTGATCATC TTTTTTTAAT AATTAGAAAC 

£151 ACCAAAAAAA TCCAGAAACT TGTTCTTCCA AAGCAGAGAG CATTATAATC 

62 01 ACCAGGGCCA AAAGCTTCCC TCCCTGCTG? CATTGC7TCT TCTGAGGCCT 

62 51 GAA.TCCAAAA. GAaAAACAGC CATAGGCCCT TTCAGT3GCC GGGCTACCCG 
S3 01 TGAGCCCTTC GGAGGACCAG GGCTGGGGCA GCCTCTC-3GC CCACATCCGG 

63 51 GGCCAGCTCC GGCGTGTGTT CAGTGTTA.GC AGTGGGTCAT GATGCTCTTT 
6401 CCCACCCAGC CTGGGATAGG GGCAGAGGAG GCGAGGAGGC CGTTGCCGCT 

64 51 GATGTTTGGC CGTGAACAGG TGGGTGTCTG CG?GCG?CCA CGTGCGTGTT 
6501 TTCTGACTGA CATGAAATCG ACGCCCGAGT TAGCCTCACC CGGTGACCTC 

65 51 TAGCCCTGCC CGGATGGAGC GGGGCCCACC CGGTTCAGTG TTTCTGGGGA 
6601 GCTGGACAG? GGAGTGCAAA AGGCTTGCAG AACTTGAAGC CTGCTCCTTC 
6651 CCTTGCTACC ACGGCCTCCT TTCCGTTTGA TTTGTCACTG CTTCAATCAA 
6701 TAACAGCCGC TCCAGAGTCA GTAGTCAATG AATATA.TGAC CAAATATCAC 
6751 CAGGACTGTT ACTCAATGTG TGCCGAGCCC TTGCCCATGC TGGGCTCCCG 
5801 TGTATCTGGA CACTGTAACG TGTGCTGTGT TTGCTCCCCT TCCCCTTCC? 
6851 TCTTTGCCC? TTACTTGTCT TTCTGGGGTT TTTCTGTTTG GGTTTGGTTT 
6901 GGTTTTTATT TCTCCTTTTG TGTTCCAAAC ATGAGGTTCT CTCTACTGGT 
6951 CCTCTTAACT GTGGTGTTGA GGCTTATATT TGTGTAATTT TTGGTGGGTG 
7001 AAAGGAATTT TGCTAAGTAA ATCTCTTCTG TGTTTGAACT GAAGTCTGTA 
7051 TTGTAACTAT GTTTAAAGTA ATTGTTCCAG AGACAAATAT TTCTAGACAC 
7101 TTTTTCTTTA CAAACAAAAG CATTCGGAGG GAGGGGGATG GTGACTGAGA 
Murine sequence of the non-coding RNA gene (including the 
putative promoter) 

1 CTTAGAGTTT CGTGGCTTCG GGGTGGGAGT AGTTGGAGCA TTGGGATGTT 

51 TTTCTTACCG ACAAGCACAG TCAGGTTGAA GACCTAACCA GGGCCAGAAG 

101 TAGCTTTGCA CTTTTCTAAA CTAGGCTCCT TCAACAAGGC TTGCTGCAGA 

151 TACTACTGAC CAGACAAGCT GTTGACCAGG CACTCCCCCC AACAATATCC 

2 01 TCCCTCTTCC CCCCCCCCAC CCCCGCCCCG TG TGCTCGTT AGGGCAATTG 
251 AAAGGACACT CCCATTTTTG GTGCCATTGA TGCCCTGTCC ATAATAGCTT 

3 01 CCCTGACTTT TACACCACCC CAACTCCCAA TCTGAAGGAC TGGGAGGTGT 
■ 2 51 GATGCAGGAG AAACTATGGG ACTCTTGGGA GAAGACTATG GAGTTGGCCA 

401 GTGATTAAGG CCCACTAATT CCAACTGTGG TA GC AC AG AT CTGGCTCCAC 

501 CACCTGTCTG ATCCAGCTCT GACATGGCTA GAGGTGAGTC CTAAACTGAT 

351 GGCTTATAAA CTAGC CTG AG CCACAGAAGA GTATGGCCCA GAGTGAAGTG 

501 TCATCATCTG TTCACAAGGC ATGCTCCCCT AGAAGATAAT GCTAAAGAGG 

£51 TGC C ATGG AG GCAGCAGGAC AAAGTACAGG CAGGCTAGGT GGAGTCAAGC 

801 AGCTTAGAA.T TATTTGCACT ATTGAGTCTT CATGTCCCCA CTTCAAAACA 

851 AACAGATGCT CTGAAAGCAA A.CTGGCTTGA AATGGTGACA CTGTCCCACA 

501 AGCCACCAGA CATGGCAGTG TTCAGAACTA CCTGTATCTG TATATACCTG 

951 CGCTTGTTTT AAAGTGGGCT CAGCACATAG GATTCCCAAG AAGCTCCGAA 

1001 ACTCTAAGTG TTTGCTGCAA TTTTA.T'AAGG ACTTCCTGAT TGCTTTCTCT 

1051 CTCGTCCTTC CATTTCTTCC TTCCTTCCAT TTCATGCTTT CA.TTTCTTCC 

1101 CCTAGCTTCT AGTTGTTTCT TCTGTTCCAG GCAGCTGCAG TGCTGAACCA 

1151 CATGGTTACC TAACAGCAGT CAGCTGCAGC CCTAGGATTC TTCCTGCCCT 

12 01 TTAACTTCCC ATTGCCAGTG CCAGGTATCA TATTTAACCT TGAGCAA.GA.G 
1251 CTGGGCTCTT TTGAGCCCTC CCTAACCTCT GTGAAGAAGA ACAAGAAGGT 

13 01 AGGAAGCTCT TGCTCTTGCT AAGAAAAATG TCAAAAGGCT TTCAGACCTT 
1351 AAACAATGAG CCTTTTCACC TTTTACTCTA GAAAAGTGGA CTAGAAAATC 
1401 TGGGTCACAT TGGGTAGCTG AAGGAGATAC AGAGGCCCCT ATGGCCTGCC 
1451 AGA.GTCGTTG CATGGCCCAA CAGGGGCTCC ATGCCCACTA CCCTTGACCC 
1501 TACTCAGAAA TCTAATGTCA TACTTAGTGT GGGCAGGGGA CCTGTCAGGA 
1551 CAGATGCAGA CCTAAGCAGG GAGTGACACC AGGGCCCTTG GCCCTTCTTC 
1601 TGACAAACAT ACACATCCCA AGTCTTTTTC TAGTGGAATT CTTAACCTCT 
1S51 TGCTCACTGG GGACTGGGAA GCATCAGCAC ATCCCATATT TCAAACTCTG 
1701 CTCCATAAGT ACAGTGGTGA ATTTTATAGA CTTGACTTTG CTGTGGGGTT 

1751 TTAATTGGTC AGTT7TAATT TGGGATCCCA AAGTTTTAAC CTCCATTCAG 

1801 GAAGTCCTTA TCTAGCTGCA TATCTTCATC ATATTGGTAT ATCCTTTTCT 

1851 GTGTTTACAG AGATGTCTCA TATCTATCGA AATCTGTCTG AGAAGTACCT 

1901 TATCAAAGTA GCAAATGAGA CAGCAGTCTT ATGCTTCCAG AAACACCCAC 

IS 51 AGGCACGTCC CATGTGAGCT GCTGCCATGA ACTGTCGAGT G TGTATTGTC 

2 001 TTGTGTATTT TCGTTAACGT TCCCCAGCTT CCTTCCTGCG GTGTAATCAT 

2 051 GGAAGAGTGA AACATCATAG AAATCGTCTA GCACTTCCTG GCCAGTCCTT 

2101 AGTGATCAGG AACCGTAGTT GACAGTTCCA ATTGATAGCT TAAGATAAAA 

2151 CCATGTTTGT CTCTTA7GGA ATGGTTAGAA CTAAGTGAGA GATCTTGCCC 

2 251 TCCTTGTGCT ATAAAAGCAA ACCCTGCAAC CAGCTTTCTG TCAGGCAGTC 
2301 CTTTTGCCTG CTCTGCTTTT GATCCTCTTA GTCTTGCTTC TGGTTCCTCC 
22 51 CTGGAGAGGG AGGAGGGGTC AGAAGAGGAA TTCTGG.-.GGA TCCAGGATAT 

2501 ATCCCTCATC TTTGGAAGAC AACCTAGGCT GAT7AGATAT TTACTTTTGG 

2 551 GATTGCAGCA CTTTGGGTGC CGTTTTTCTT TTACTTGGGT TTTATCTGCA 

27 01 GCTCCCTCAC CACCACCACC ACCCCCCAC? TACCTGTATG TAGAACTGA.? 

27 51 TTCAAAACTG C AGG TGGTGG TAACTGCAGC TTC77AGGGT TTTCTTCACT 
2301 TCTTGCT.TCT TTCCCCATTC CCTCATCCAC AAATAAGGGC ATCACAAGTC 

28 51 AGTCTCCTTT AAGCAGGCAG CTTTGGTGGG GTTTTTCCCC TGGAAGCCAG 
2 201 GGACCCTGTC AGGCTGCCTC TGCCTTGTGG TCAGGTTGAC AGGAGGTTGG 

2 951 AGGGAAAAGC CTTAAGTCAT GGGATTCTCA CCAGCTGTGT CTGGCTCAGA 

3 001 CCTGGAATGT GACCTTTATT TTGTTGTATT TGAACATTGT AAAGTGTGGG 
3 051 TGGTACCTTA AACTGAATAT GTGAAGAATC CAGAAACTGA CCAACAGCTT 
3101 TCAGATACCT GGGGCTAGGT CACTAAGGTC ACATCCAGTC TTCCCTACCC 
3151 TGTTCTAGTT GTTAGCTACT ACCTCTCCCA GATAGATTGC TGTATATCCT 
3201 CCAACTATGA TCATCCTGGC CCAAGCTTGC CTGITCTTGA GTCTGTCTTA 
3251 ACCAGTGGAA CTGCTGCCCT TGGTGTGCAG TGAGTTGAGG ACTCTTGGTC 
3301 ACAGCCAGGC TCTAGTAG TA CAGCTCCTTT CTGCTGGTGC TGTATTTCCA 
3351 TATCAAAAGG CACAGGGGAG ATCTAGAAAT GCCATCTCCC CCAGTCCATC 
3401 AGTGCCAAAC AAGCCCATGA TCCCAGCATG GGTACAGACA ACTCTG TTCA 
3451 GTGCTATCAC AACAGACTAG AGGCCATGAA CATTGGACGT GGGAACCAGA 

3501 GCAACCCGAA TTGCTGCTGC TTTATTCAGC TTTCCG7TGC TCTGACAATG 

35 51 ATAAAACAAG GCAGTAACTT AAAACAGACT GCCAGGTTTG G C AG AGAAAG 

3 501 GAAATTCCTT AGCTGACAGC ACCTCTGGAT TTTAAATAGG TTGTAATAAG 

3 651 TGGCTCAAAC CCATCCAGGA AAAAGCAAAA GGGTTAGAAC 7GACCAGA7G 

37 01 AGACCAGCCT GAT7TCATGC AGCCCAAATG GAGTCCAGCT G7CTGAAC7C 

3 751 TGCAGCACTT CTCTACTACA GTCTCCTAGA GCATTCCAGC CAGGCTCTTC 

3 801 AGGCTGAGGA GACATCACAG GTGCCAGTTC TTCAAGAAGA CTTTTGTGCA 

3 851 TCAGTTCATA GCCTA7ATCT TTGCCCAAGA 7TG 7AG ATTC AGGTTAACAC 

3 901 TACAGATTCT AGGGCAGATG ACTGAGACTC AGAAAAAAAG CCCCTGTGGA. 

40C1 ATGCC7CATG CCAGAGCCAA GCCCTCTGCT CCATCCACAT CC77TTC7GG 

4 051 C7CCTTCTTC CTGCTCTCTG CTTCAGTGAA CCAGCCCCAC TCTGAAGAGA 
4101 TTTGTTGATT CTCTCCA77T T7A7GTC777 C7C77TTAGG TACTATATAG 

43 CI TTAATGTTTA TGAA7AAGAG GAGGCTTTTG AAAAAATGTT GA.TCTATAAA 

43 51 7AC77ACTTT AGGCCTGAGG TGTCTAATGA G7GAAC7GAG CAATGGGAAC 

4401 7CAAGGCTGA AGCC7CCTGC ATCAGAGGAG GTAGAACCAG GAGCCTCTTG 

4501 CAGAAACTAC TTCTGACCTT GTCATTTGGA ATGGAGGTTA G7GG7CTGCC 

45 51 AGA7GCCAAA GCTGCA7GAG ACCAGC7C77 GG777A7CAA 77TGAACACT 

4501 CAG7AACC7A GAAGGCCCAG CACAAAGTG7 C7GCTC7C7T C77AAC7GAG 

4651 CC7GCCCCAG CACTAC7GCA CAAA7TAGGG AGGGTC7AC7 7C C 7 AC AG AG 

47 01 CATCCCTCCC TGGGCCCCCT CCCATCC77T GTA.CTCTACC TACCTGACCT 

4751 TCAGGATCTT GGCACA7ACG AAATGGC7GT G7AGCAAGCA CTT7GGCATG 

4801 CCC7CCTAAA CTTACCCCAG AGCCTCTCCC 7GCC7CG77A AGCCAG7CTG 

4851 CCTGTCTTC7 GGGGAGG7GT 7AGAGCCCA7 AGAATGGAGA GGAGAAAGAA 

4901 AAGAGGAAGA GGCAGGCAGG TAGTAAAAAG GCTCTGGGAG GAAAGACAGC 

4951 C7CC7AGGCT TTGCACAAGC AGGACTCAGC CCC7TG7GGG AAC7AAGTGC 

5001 CA7C7TGGAG TTTAAGAACA TTTGGACAAG 7TGCAAATGA CC777GCTCC 

5051 7TGC7CCTCT CACC7777A7 GGGGCCCTGC TTAGCACTGA AAGCAAATGC 

5101 GCTGAAAAGG CAAAGAGGT7 TGGCTCCTGC CCAC7GA7AG TCC7T7CCC7 

5151 GCAGTGTTTG TGTGTCAAGT GGCAAAGC7G TTCTTCCTGG TGACTCTGAT 

5201 TAGATCCAGT AACTTAAGAG ATTTGTATGC ATAGGTCTGC TTTGACTCTT 

Fig. 2 (cont'd 2) 
5251 CTATTCTGGG CTTTTGATTT GTTTTTCAGT TTTGCTTTTA GTTTTCCTAT 

5301 TTTTATTTTA TGCACCAACT AGACACACAA AGCAGTTGAA TTTATATATA 

53 51 TATATATATA TATATATCTG TATATTTCAC AATTATAAAC TCATTTTGCT 

5401 TGTGACGCCA CACACACACA AAAAGAAAAA CCTTTTAAAA TTATACCTG? 

5451 TGCTTAATTA CAATATTTCT GATAACCATA GAGTAGGACA AGGGAAAAAA 

5501 TTTAAAAAAA AAAAAAAAAA AAGAAAAAAC ACATCTGTCT GCTGC-TCACT 

55 51 TCTTCAATCC AAGCAGATCT GTGATCTTTC CTCGCGTCTT TCAAAGACTT 

5601 CCCTGTGCTA AGTGAAGGAA GCTCCAGGCT GCACCCAGGT TTTGTGCTTT 

5 651 GTTTCTCCTC TGTTGTGAAA GGGGCCC ZAA GATTCTGGGT ACAGGACAGT 

57 01 TCATTTCAGC ATGGGG 7C AG GAGACAAGAG CACTCCCTT? ACATGCTGAC 

5751 GTACAGAACT TAGTGGGAAT AGCCTAGTCC CCACCTCT AG GGATGGGGAG 

5301 CTAGCATGCA TGGGGG 7GAC CCAACTCCCT CCACCTTTCC CTGGCCAGGA 

5901 AGAGCATTTA AAAACCCTCC AAACTTTGCT GAGTCTAGGG ACTAGAGAGA 

SO01 ATGCCAGGGA CCCCAGAACC ACATCCAACA GCCCAATGGG TCTCCTCCAG 
6051 AAAG TAGTG A AGACTCCAGA AACATCCCTT TCTCTTCTCC CTGCTCCCAT 

5151 AAAAAAAATT AGCTGTAACA GTTCTTTTTG CAAAAGGATC ATTCTTAAAT 

62 01 AATTAAAAAC ACCCCCCCCC CAAAAAAAAG TCCAGAACCT TGTTCTTCCA 
5251 AAGCAGAGAG CATTATAATC AGGGCCAAAA TCTGTCCCAC ACCTCTACCC 

63 01 CATCTCCTCA TGP-JTTGCTGC TTCTAAGGCC AGAATACAGC AAAGATATTT 

63 51 GTAGGCCCTT TGGGTGACTG GGCTACCCTT GGAGCTCTTG GAAGATGGGC 
6401 TGGGGAAGCC TCTGAGACCC TATCCTAGGG CCTTGCTCTA GGGAGTAATC 

64 51 AGTATTAGTA GAGTGTCACA ACATTATTCC CCAGCCGGCA TG AG ATGGGG 

65 01 GCAGAAGAAG CCAAAGGGTT GTCTCCACTG CTACTTACTT GGCCACTGAC 
6551 AGGTAGGTGA CCATGTATGT CCATATGCAT GTTTTATSGC TGATGTGAGA 
6601 TCAGCACCCA AGTTAGCTTC ACCTGGTGAC CTCTAACCCT GCCTGGATGG 
6651 AGCAGGCCAC CTGGTTCAAT GTTTCTGGGC AGCTGGACAA TGGAGTGCAA 
6701 AAGGCTTACA GAACTTGAAG CCTTTTCCTT ACTTTGCTAG CACGGCCTCC 
6751 TTTTCCATTT GATTTGTCAC TGCTTCAGTC AATAACAGCC GCTCCAGAGT 
6801 CAGTAGTTGA TGAATATATG ACCAAATATC ACCAGGACTG TTACTCAACG 
5851 TGTGCCGAGC CCTTTCCTTG TGCTGGGCTC CCTGTGTACC TGGACACTGT 
6901 AATGTGTGCT GTGTTTGCTC TCCTTCCTCT TCCTTCCTTG CCCTTTCCTT 
6951 GTCTTTCTGG GGTTTTTCTG TTGGGTTTGG TTTGGTTTTA TTTTTCCTTT 

Fig. 2 (cont'd 3) 
7001 TGTGTTCCAA ACATGAGGTT TTCTCTACTG GTCCTCT7TA ACTGTGGTGT 

7051 TGAGGCTTCT ATTTGTGTAA TTTTTGGTGG GTGAAAGGAA CTTTGCTAAG 

7101 TAAATCTCTT CTGTGTTTGA AATGAAGTCT GTATTGTAAC TATGTTTAAA 

7151 GTAATTGTTC CAGAGACAAA TGCTTCTAGG TACATTTTCA TTACAAACAA 

72 01 AGCATTTGAA GGGAGGGAAG TGGTGAATAA GACAAGAGGG GCAATCTGAA 

7251 TTGATCCCTG CCCAGATCAG C C AGAAGCTA CCAAAAG TT A AGCAC7GGTT 

7301 TTCCATTCCA AGTCAAGAGA CTGAAGCTGA TGTTTTGCCA TTTTCAAAGT 

7351 CAAAGCAAAA CCAGCTTTTC CACCCAATGG ATTCTTTGCT TCTCCTTCCC 

7401 AGATTATTAC TACTGCTGTA ATAATCTAGG AGTGCCAGGA GGGAAAGGAG

7451 TATTAACACA GAGCTGTGCT CACTGAGTAT GGAAAGGCTT GGTCTGAGTT

7501 TTCAGGAGGA TGACCCACTG TGGACAT3GG GAGAAGACAG AAG.ATAAATT

7601 CATTATTGTC TACAAGGCAT GTTTCAAAGA CATGACCAGT CAGGACACTT 
7651 CTGTCATACT CCATGTTGCC CCCCAGTACA CAGTACTAA? CTGA.TATCTC 
7701 TGTTCCCGCC ATGCCTGGGG GATAAAATGA TAGCAGAGAC TCCTTTCCTT

7851 GCCTAATGTA AGAGGAACAG AGCAGTGTTC CCTTGG?-.GCC TCATGTGGAC 

7901 AGTTCTACCT GTAGTGACCA GTTGGCTATA GTAGTTATTA GCTGGAACAA

7951 CCAGACAGGG TACATGCCCC CTCCAAAATC CATGTTC-TAC TCCCCTCTGC

8001 CAGCCAGGGG GGGTGAGATC TGTAGAATAG TGCAGCCAGT GACAAGCCAC

8051 CTTGTGTTTG TCACCAGCTC AAAAACTCAT CTAAGGTTGG GAGCAGGCAG

8101 ACAAGGCAGA GAGAAAGATC CAGGACAGAC CTAGCTGGGC TGGAGGGGTC

8151 TTGAAAAGCC CTCTGTCGTA TTCACCTTCA GTTTTTGTGC TTTGGGACAA

8251 TTTTGTAGTG TTCAAAACAA AAGGTTCTTT GTGTATAGCC AAATGACTGA

8301 AAGCACTGAT ATATTTAAAA ACAAAAGGCA ATTTATTAAG GAAATTTGTA

8351 CCATTTCAGT AAACCTGTCT GAATG7ACCT GTATACGTTT CAAAAACACA

8401 CCCCACTGAA CCCCTGTAAC CTATTTATTA TATAAAGAGT TTGCCTTATA 

8451 AATTTACATA AAAA 
iCTTCAGGGTGGGAGTAGTTGGAGCATTGGGGATGT



51 TTTTCTTACCGACAAGCACAGTCAGGTTGAAGACCTAACCAGGGCCAGAA 
50 

SCACTTTTCTAAACTAGGCTCCTTCAACAAGGCTTGCTGCAG 

51 ATACTACTGACCAGACAAGCTt 



100 



150 - 



— TC CAACAATATC 



232 GCGACAGAGCAGTTGAGAGGACACTCCCGTTTTCGGTGCCATCAGTGCCC 



47 6 TTCACAGGACAGGAAAGTGGCACCTGTCTGCTCCAGCTCTGGCATGGCTA 
S26 GGAGGGGGGAGTCCCTTGAACTACTGG.GTGTAGACTGGCCTGAACCACA 



575 GGAGAGGATGGCCCAGGGTGAGGTGGCATGGTCCATTCTCAAGGGACG . T 



3 7 2 0 GGGATTCCTAATCACTCAGAGCAGTCTGTGACTTAGTGGACAGGGGAGGG 

rt! 724 ACTAG— A- 

: ii ; 770 GGCAAAGGGGGAGGAGAAGAAAATGTTCTTCCAGrrACTTTCCAATTCTC 



! 820 CrrTAGGGACAGCTTAGAATTATTTGCACTATTGAGTCTTCATGTTCCCA 
870 CITCAAAACAAACAGATGCTCTGAGAGCAAACTGGCTTGAATTGGTGACA 



; - ^ 970 TGTATATATACCTGCGCTTGTTTTAAAGTGGGCTCAGCACATAGGGTTCC 
5 i.020 CACGAAGCTCCGAAACTCTAAGTGTTTGCTG CAATTTTATAAGGACTTCC 



) CTTTCACTTCTTTCCC 



1170 CTCTTCCAGGCAGCCGCGGTGCCCAACC ACACTTGTC 



— ACATGGTTACCTA GCA— 



3 . TGAAGAGCAAGAGAGCAGCAAGGTCTTGCTCT 



5 CCTAGGTAGCCCCCTCTTCCCTGGTAAGAAAAA. . GCAAAAGGCATTTCC 
' — C TGT . A 

! CACCCTGAACAACGAGCCTTTTCACCCTTCTACTCTAGAGAAGTGGACre 



1394 - 
1504 C 
21



1997 TCTTATGCTTCCAGAAACACCCACAGGCATGTCCCATGTGAGCTGCTGCC 

2047 ATGAACTGTCAAGTGTGTGTTGTCTTGTGTATTTCAGTTATTG . TCCCTG 
1977 G A TC AC-T CA 

2096 GCTTCCTTACTATGGTGTAATCATGAAGGAGTGAAACATCATAGAAACTG 
2027 C--GC G-A Tc _ 

214 6 TCTAGCACTTCCTTGCCAGTCTTTAGTGATCAGGAACCATAGTTGACAGT 



. .AAGTGAGGGAGTTTGCCCCG 



3TAGAGTCTCATAGTT 
-CC~. .A 

2 GGACTTTCTAGC^TATATCTGTCCATTTCCTTATGCTGTAAAAGCAAGTC 



2 GGGAAGGGGGGTCAGAAGAG . . 



AGGGTGAGTCCTCC 

--GAATTCTGGAGGATCC A-AT T- 



2 617 TTTTATGTATTA. . . 



— TGGG-ATCCC— 



I AGGAGTGCGTGGCCCACCAGGCCC 



2954 CTCGTCTTTCCCAAA GGCATCACGAGTCAGTCGCCTTTC 



CAGGAGGTTGGAGGG . AAAGCCT 



8 TAAGCTGC^GGATTCTCACCAGCTGTGTCCGGCCCAGTTTrGGGGTCTGA 



■AG . AATTGACCGACAGCTTTCCAG 

6 C AA-C A G-ATC A— C A AGA 

3 TACCCATGGGGCTAGGTCATTAAGGCCACATCCACAGTCTCCCCCACCCT 



5 — T G-T-T T TC G 

0 TCCATCAGTGCQAACTAGCCAACGGCCCCAGCTTCTCAGCTCGCTGGAT 

5 A c-T-AT A- 

0 GGCGGAAGCTGCTACTCGTGAGCGCCAGTGCGGGTGCAGACAATCTTCTG 

0 A ..C 

0 TTGGGTGGCATCATTCCAGGCCCGAAG.CATGAACAGTGCACCTGGGACA 
3 GGGTGGAGAAAGGAGTTTCTTTAGCTGACAGAATCTCTGAATTrTAAATC



W/72C2. 

-^ATATATATATOTGTATATISCACAATTATAAACTC 

J ATTTTGCTTGTGGCTCCACACACACAAAAAAAG ACCTGTTAAAATT 

5 ATACCTGTTGOTTAATTACAATATTTCTGATAACCATAGCATAGGACAAG 



5343 TATATATATATATA— 



5 TGGTCACTTCTTCTGTCCAAGCAGATTCGTGGTCTTTTCCTCGCTTCl'TT 
> CAAGGGCTTTCCTGTGCCAGGTGAAGGAGGCTCCAGGCAGCACCCAGGTT 



5 TTGCACTCTTGTTTCTCCCGTGCTTGTGAAAGAGGTCCCAAGGTTCTGGG 



— GACAGTTCATTTCAGCATGGGGTCAGGAGACAA- - 



. -GAGCGCTCCCTT 



4214 GGGCAGAGCC. . CTGCTCCCTCC . . . 
4014 -A— CA CT A— -ACATCCTTTTCT 



. . GGGTCTTCCTACTCT 



5835 GCCT. . CCCTGCCAGTAGGG . TCCGAGTGTGTTTCATCCTTCC . CACTCT 

5878 — T-TC— A-TT--C C— A— CA-T-AAA-A— C AA— T- 

5881 GTCGAGCCTGGGGGCTGGAGCG3AGACGGGAGGCCTGGCCTGTCTCGGA. 



4331 GGCATAGTCTACTTtnTATAAATCGTTAGGATACTGCCTCCCCCAGGGTC 



4431 CAATTTAGAAGGAAAACCTAGAAAACATTKKCAGAAAATTACATTTCGA 



4481 TGTTTTTGAATGAATACAAGCAAGCTTTTACAACAGTGCTGATCTAAAAA 
4305 AG— G-G GA — A-A — T T 



S028 ? 
6029 C 



) TCAGTCCAAGGGGTCCCCTCCAG.GAGTAGTGAAGACTCCAGAAATGTCC 



CTAATAATTAGAAACACC . . 



— CCCCCCCCAA 

3 AAATCCAGAAACTTGTTCTTCCAAAGCAGAGAGCATTATAATCACCAGGG 



&.-AGTGT— G-. . 

i ,4701 ATCTTGACAGAGCCTGCCCCAGCGCTGACG 
■■ 4639 T A— T A — — TGCACA 

4751 AACTAGGAAGGCACTTCTGCCTGAGGGGCAGCCTGCCrr. .GCCCACTCC 



i CATCCTTTG-A A-CTA GACCTTCAGG ATCTTGGCACATA — A- 



6325 t 
6294 C 



— . .-C-GC G-T-TTTG— 



— T-- GAG-TC-T A— . . T A A-A — C 

1 CATCC. -GGGGCCAGCTCC^CGTGTGTTCAGTGTTAGCAGTGGGTCATG 



2 GTTGCCGCTG. . 
L — CT — A CTACT-AC ACTG A A-CAT — AT— 

3 CCACGTGCGTGTTTTCTGACTCACATGAAATCGACGCCCGAGTTAGCCTC 



3 GAGCCCATAGAATGGAGAGGAGAAA-.S 



. .GTCAGGAGGCAGAAGGAAGCAGGTG 



6669 i 
6638 t 



J GTCTTTCTGGGGAGCTGGACAGTGGAGTGCAAAAGGCTTGCAGAACTTGA 



r . TTTCCGTTTGATTTGTC 



6687 ACTGCTTCAATCAATAACAGCCGCTCCAGAGTCAGTAGTCAATGAATATA 



4995 TTTGAATTCAAGAACATTTGGGGAATTTGGAAATCTCTTTGCCCCCAAAC 
5005 — G — G — T AC — G C GA-C — TG-T — TTG- 



;-C A .T T- 

XGCAGTGTTTGTCTGTCAAGTGGCAAAGCTGTTCTTCCTGGTGACCC 



rTAGTTTTGCTTTTAGTTTT 



5292 TCTGTC CCTTTTATTTAACGCACCGACTAGACACACAAAG CAGTTGAATT 



6919 — T — 
6885 TGTTT 



r . GTGTATCTGGACACTGTAACGTG'rGCTGTGTTTGC 
CTCCTTCTTTGCCCTTTAC 



rTTGGTGGGTGAAAGGAATTTTG^AAGTAAATCTCTTCTGTGT 
--T-C-A ...GT-



:CACTGAATCCCTGTAACC 



Fig. 3 (3) 

dashed line: putative promoter 

full line: sequence-conserved high-energy sequence 
human
makak 



■ - CCCCCATGTGGTCGT 



-. C . . -AATA. . . -TC A c " 

TC CAA-AATATCCT-CC-CTTCCCCCCCCCCACCCCCG . G c— 

TC ACAA-AA-A-CC— C-CCCTCCTCACCCCACCCCTAT TG C— a" 

— TTT-TAGGGTA-A — AGC GC — T TCATC-C- 



human
makak 



human 
schim 

makak 

mouse 
rat 



TAGAGACAGAGCGACAGAGCAGTTGAGAGGACACTCCCGTTTTCGGTGCCATCAGTGCCCCGTCTACA. 



T TGA — 

— T CTGA — 



GCTCCCCCAGCTCCCCCC . -ACCTCCCCC 



- .T C-T-. 

— T C-T-ATA-- 

-C-T-. 



-TGA TT-- 

— TGA— TTTA- 
— TGA — TTTA- 



-TAA-AGAGTAG-G-TAGTGG-AG-TTA-ATTTT-AGTG— 
201 

ACTCCCAACCACX3TT . GGGACAGGGAGG TCTGAGGCAGGAGAGACAGTT . .GGATTCTTTAGAGAAGA. . 



T-TGA. . . C C A-A-T-T-G 

T-TGA. . .A T T A— TA-G 

T-TGA. . . T C A— TA-ATA- 

-T-AA— TT-TA— CCAA-GTCTTA-AT-A-T-T-TT-AG-G-TTTT— 



- TGGATATGACCAGTGGCTATGGCCTGTGC 



CTA _ . — GG A— G CA.-T 

-C GG CTA GT--G AT— A CA. CT 

-C GG CTA-A— GT— G-A AT-GC-C — CA.-T 

CCCT GG-GCC . G — GGG — GGGG-A— G- . — ATTA- 



orang
makak
hamst



human
schim
orang

hamst
mouse
rat

human
makak



GATCCCACCCGTGGTGGCTC^GTCTGGCCCCAttCCAGCCCCAATCaU^ 
A— TAG-A-T A -T-CA-^ 



AT— TTAGGAAA-A— G-TG— A-A-A-AG— G-G— CTGAGC-GTTGGC— A. . GA-C-TGACTAGGG-CC-G T- . A---AA- — 

401 

AGCTCTGGCATGGCTAGGAGGGGGGAGTCCCTTGAACTACTGGGT . GTAGACTGGCCTGAACCACAGGAGAGGATGGCCCAGGGTGAGGTGGCATGGTCC 



. — . -A GA C-TA — A A — 



— A T A — 



T — A-G T-A-T. . 

A AG-T-A-T. . 

AGAT-A-T.. ________ _ „ _ -__--„-__. _- — _ 

CAAGG CCA— T-A-T A--AGGG-GGG--AAGAC-T-A-A-AAGGA-TAG. . AA-C A— T— CC-A-A-AA-AGC T. 

501 

ATTCTCAAGGGACG . TCCTCCAACGGGTGGCGCTAGA GGCCATGGAGGCAGTAGGACAAGGTGCAGGCAGGCTGGCC-GGGGT^ 



C-T-C-T-A-T-GA-AA-AATT A-GAGG.T C CA A— 

C-T-C— C-T-GAA-A-AAT A-GAGG.T C A— A 

C-T-C C-T-GAA-T-CAT A-GAAG.T C A A 

■ -AACC-TAC — A GGA-T — A-TTG-A-GAGGCCC-T A-TCCCC-ACCACCAA-A— 



— A-GTG-A- . A A CT 

. -A-GTG-A- . A A CT 

. . AT-T — A-C-GCA — T. . — TT 



human 
makak



AGCACACJCGGGGTGAGAGGGATTCCTAATCACTCAGAGCAGTCTGT^ . 



.TAGTGGACAGGG-AGGGGGCAAAGGGGGAGGAGAAG 



human 



human
schim 
orang 



.-G-G C— 

~ CGT-.. 

— TG-CA-A-AACA A-CAAT — G . TG . . . — T TA-G . . -T TG— AC-A— GC-T A— A-T- 

-TG-CA-A-AACA— -A-CAGT- . . .G . . . — T-A-TAAGA . ____A-A-G- 

— TG-CA-A-A-CA A-CAGT — C . TG ... — T TAAG- A— A-GA 

CATTT T — ACCTT-T-TATA — TGGGTGTG-ATGCAC-TAGATA '. A-TGA— A-GA 

701 



AAAATGTTCTTCCAGTTACTTTCCAATTCT . . 



.CCTTTAGGGAC_AGCTTAGAATTATTTCCACTATTGAGTCTTCAT. . 



. GTTCCCACI^C-AAAACAAA 



CT G-G— ACTAGC G C-G-C A 1 

. . TCT GAGAGCAAACTGGCTTGAATTGGTGACATTTAGTCCCT . . 



!A TCCCA 

_ TGACAGTGTTGAGAACTACCTGGATTr 



kanga -T — CTGAGATG-TCA-C- - 



human 

orang 
makak 

mouse 
rat 



Fig. 5 
Partial sequence of the non-coding RNA gene from hamster

1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCCCCCCA 

51 ATACTCCCCC AATGTGCTCA TTAGAGATAG CAGTTGAGAG GACACTCCCA 

101 TTTTTGGTGC CCTGTCCATA GCTTCCCTGA CTCTTCCACC ACCCCAACTC 

151 CCAATCTGAG GGACCGGGAG GTGCGAGGCA GGAAAAATAT TGGATTCTTT 

201 AGAGAAGACT AGAGGTGACC AGTGACTGTG GCCCAGTAAT TAGAACTGTG 

251 GTGGCACAAG TCTGGCCCCA CATCCACCCA ATCCAAAACT GATAAGGATA 

301 TTTTGAAAAA CAGGAAAGCA GTACCTGTCT GATCCAGCTC TGGTATAGGT

351 AGGAGTGAGT CCTGAACTGC TGGATTACAG ACTGGCTTGA GCCACAGAAG 

401 ATGATGGACC AGAGTAAAGT ATCATCACCT GCTCACAAGG CATGCTTCAC 

451 TAGAGAATAA TTCTAAAGAG GTGCCATGGA GGCAGCAGGA CAAGGCACAA 

501 GCAGTCTGGG TGGGGGTCAA GCCAGACCTA GTGCCACAGA ACAAGAGAGC 

551 AATCTGTGAC TAGTAGTTAG GGACTTTGTG GATGGGACAA GGGGCATGGG 

601 GGAAGAAATG AAAATATTCT TCCAATTACT TTCCAGTTCT CCTTTAGGGA 

651 CAGCTTAGAA TTATTTGCAC TATTGAGTCT TCATGTTCCC ACTTAAAAAC

701 AAACAGATGC TCTGAAAGCA AACTGGCTTG AAATGGTGAC ACTTTGTCCC 

751 ACAAGCCACC AAATGTGGCA GTGTTTAGAA CTACCTGGAT CTGTATATAC 

801 CTG 



Fig. 5a 
Partial sequence of the non-coding RNA gene from kangaroo

1 TTGCTGCATA TACTACTGAC CAGACAAGCT            CTTTTTAGGG
51 TACACCAGCA CCTGCCCTCC ATTCATCCCT v? 1 1 oooiVjAVj GGATGGTGTA
101 CTGGTTGTCA CTAGAGACCT AACAGAGTAG GGTTAGTGGG AGCTTACATT
151 TTCAGTGCCA TTAACATTCT AGTCCAAGGT CTTAAATTAT TATGTTGAGG
201 GGTTTTTTTT CCCCTGAGGG GGCCGGGGGG TGGGGGGAGG GTTGATTAGA
251 TTCCTTAGGA AAGAGGGTTG AGACAGACAG CAGAGCACTG AGCAGTTGGC
301 ACTAAAGGAG ACCTTGACTA GGGGCCAGGT GGCATCATCT AATCCCAAGG
351 GGCTCCAAGT GAGTATTAGG GTGGGGGAAG ACATTATAGA AGGAATAGAA
401 ACAGGATAGC TCAGCCTAAA GAAGAGCGGT TAAAACCCTA CCCACCAGGA
451 GTTGACTTGA AAGAGGCCCC TATGGAGGAA TCCCCAACCA CCAAAAGCAA
501 TCTTGAGCTG CAGCTGCTTC ATTTAGTGGA CCTTGTGTAT ATCTGGGTGT
551 GTATGCACAT AGATAGACAG TGAGAAAGAA AACTGTTCTT CCAGTTCTTT
601 TCCAGTGCTA CTAGCTTAGG GACAGGTTAG AACTGTCTGC ACAATTGTGT
651 GATCATTCCC ATTCCCACTT CAAAACAAAC TGACTGAGAT GTTCAACAGA
701 AAACTGGCTT CAATGGGTAA CATGCCCTTG CCACTTACTT AAGACACTGG
751 TGTGATGGGG TTTTGAACTC CCTATATTTG TAGGTATCTG





Fig. 5b 
Partial sequence of the non-coding RNA gene from macaque

1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT
51 CCCGCCCAAA CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCAGTTGAGA
101 GGACACTCCC GTTTTCGGTG CCATCAGTGC CCCGTCTACC ACTCCCCCAG
151 CTCCCCCCAC CTCCCCCACT CCCAACCACG TTGGGACAGG GAGGTGTGAG
201 GCAGGAGAGA CAGTTGGATT CTTTAGAGAT GGATGTGACC AGTGGCTATG
251 GCCCGTGCGA TCCCACCCGT GGCGGCTCAA ATCTGGCCCC ACCCCAGCCC
301 CAATCCAAAA CTGGCAAGGA CGCTTCACAG GACAGGAAAG TGGCACCTGT
351 CTGTTCCGGC ATGGCTAGGA GGGAGTTGTC CCTTGAACTA CTGGGTGTAG
401 ACTGGCCTAA ATCACAGGAG AGGATGGCCC AGGGTGAGGT GGCATGGTCC
451 ATTCTCAAGG GACGTCCTCC AGTTGGTGGC ACTAGAGAGG CCATGGAGGC
501 AGTAGGACAA GGCACAGGCA GGCTGGCCCA GGGTCAGGCC GGGCCGAACA
551 CAGCGGGGTG AGAGGGATTC CTCGTCTCAG AGCAGTCTGT GACCGGTAGT
601 TAGGGACTTA GTGGACAGGG AAGGGGCAAA GGGGGAGGAG AAGAAAATGT
651 TCTTCCAGTT ACTTTCCAAT TCTACTCCTT TAGGGACAGC TTAGAATTAT
701 TTGCACTATT GAGTCTTCAT GTTCCCACTT CAAAACAAAC AGATGCTCTG
751 AGAGCAAACT GGCTTGAATT GGTGACGTTT AGTCCCTCAG GCCACCAGAT
801 GTGATGGTGT TGAGAACTAC CTGGATATGT ATATATACCT G



Fig. 5c 
Partial sequence of the non-coding RNA gene from orangutan

1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT
51 CCCGCCCAAA CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCAGTTGAGA
101 GGACACTCCC GTTTTCGGTG CCATCAGTGC CCCGTCTGCA GCTCCCCCAG
151 CTCCCCCCAC CTCCCCCACT CCCAACCACG TTGGGACAGG GAGGTGTGAG
201 GCAGGAGAGA CAGTTGGATT CTTTCGAGAA GATGGATATG ACCAGTGGCC
251 ATGGCCTGTG CGATCCCACC CGTGGCGGCT CAAGTCTGGC CCCACACCAG
301 CCCCAATCCA AAACTGGCAA GGACGCTTCA CAGGACAGGA AAGTGGCACC
351 TGTCTGCTCC AGCTCTGGCA TGGCTAGGAG GGAGTCGTCC CTTGAACTAC
401 TGGGTGTAGA CTGGCCTGAA CCACAGGAGA GGATGGCCCA GGGTGAGGTG
451 GCATGGTCCA TTCTCAAGGG ACGTCCTCCA ACGGGTGGCG CTAGAAAGGC
501 CATGGAGGCA GTAGGACAAG GCGCAGGCAG GCTGGCCCGG GGTCAGGCCG
551 GGCAGGGCAC AGCGGGGTGA GAGGGATTCC TAATCACTCA GAGCAGTGTG
601 TGACTGGTAG TTAGGGACTC AGTGGACAGG GGAGGGGCGA GGGGGCAGGA
651 GAAGAAAATG TTCTTCCAGT TACTTTCCAA TTCTCCTTTA GGGACAGCTT
701 AGAATTATTT GCACTATTGA GTCTTCATGT TCCCACTTCA AAACAAACGA
751 TGCTCTGAGA GCAAACTGGC TTGAATTGGT GACATTTAGT CCCTCAAGCC
801 ACCAGATGTG AGTGTTGAGA ACTACCTGGA TTTGTATATA TACCTG



Fig. 5d 
Partial sequence of the non-coding RNA gene from rat

1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACTCCCCAC 

51 AACAACAACC CCCTCCCTCC TCACCCCACC CCTATCCCCT GTGTGCTCAT 

101 TAGAGAGGGC AATTGAGAGG ACACTCCCAT TTTTGGTGCC ACTGATGCCC 

151 TGTCCATAGC TTCCCTGACT TTTACACCAC CCCAACTCCC AATCTGAGGG 

201 ACTGGGAGGT GTGACGCAGG AGAAACTATA TAGGACTCTT GGGAGAAGAC
251 TATAGAGTTG GCAAGTGATT GCGCCCCAGT AATTCCAACT GTGGTAGCAC 
301 AAGTCTGGCT CCACACCAAC CCAATCCAAA ACTGACAAGG ACATTTTGCA 

351 AAAAATGAAA GTGGCATTTG TCTGATCCAG CTCTGGCATG GCTAGAGATG
401 AGTCTTAAAC TGTTGGCTTA TAAACTGGCC TGAGCAACAG AAGAGGATGG 
451 CCCAGAGTAA AGTGTCATCA TCTGTTCACA AGGCATGCTC CCCTAGAAGT 
501 TCATGCTAAA GAAGTGCCAT GGAGGCAGCA GGACAAAGTA CAGGCTAGGT 
551 GGAGTCAAGC CAGGCCTAGT GCCACAGAGC AAGAGAGCAG TCTCTGACTA 
601 GTAGTTAAGG GGGAAGAAAG AAAAATATTC TTCCAATTGC TTTCCAGTTC 
651 TCCTTTAGGG ACAGCTTAGA ATTATTTGCA CTATTGAGTC TTCATGTTCC 
701 CACTTCAAAA CAAATAGATG CTCTGAAAGC AAACTGGCTT GAAATGGTGA 
751 CACTGTCCCA CAAGCCACCA GACAATGGCA GTGTTCAGAA CTACCTGTAT 
801 ATGTATATAC CTG 



Fig. 5e 
Partial sequence of the non-coding RNA gene from chimpanzee

1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT
51 CCCGCCCAAA CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCGACAGAGC
101 AGTTGAGAGG ACACTCCCGT TTTCGGTGCC ATCAGTGCCC CGTCTACAGC
151 TCCCCCAGCT CCCCCCACCT CCCCCACTCC CAACCACGTT GGGACAGGGA
201 GGTGTGAGGC AGGAGAGACA GTTGGATTCT TTAGAGAAGA TGGATATGAC
251 CAGTGGCTAT GGCCTGTGTG ATCCCACCCG TGGTGGCTCA AGTCTGGCCC
301 CACACCAGCC CCAATCCAAA ACTGGCAAGG ACGCTTCACA GGACAGGAAA
351 GTGGCACCTG TCTGCTCCAG CTCTGGCATG GCTAGGAGGG GGGAGTCCCT
401 TGAACTACTG GGTGTAGACT GGCCTGAACC ACAGGAGAGG ATGGCCCAGG
451 GTGAGGTGGC GTGGTCCATT CTCAAGGGAC GTCCTCCAAC GGGTGGCGCT
501 AGAGGCCATG GAGGCAGTAG GACAAGGCGC AGGCAGGCTG GCCCGGGGTC
551 AGGCCGGGCA GAGCACAGCG GGGTGAGAGG GATTCCTAAT CACTCAGAGC
601 AGTCTGTGAC TTAGTGGACA GGGGAGGGGG CAAAGGGGGA GGAGAAGAAA
651 ATGTTCTTCC AGTTACTTTC CAATTCTCCT TTAGGGACAG CTTAGAATTA
701 TTTGCACTAT TGAGTCTTCA TGTTCCCACT TCAAAACAAA CAGATGCTCT
751 GAGAGCAAAC TGGCTTGAAT TGGTGACATT TAGTCCCTCA AGCCACCAGA
801 TGTGACAGTG TTGAGAACTA CCTGGATTTG TATATATACC TG



Fig. 5f 
SEQUENCE LISTING

<110> Poustka, Annemarie
      Coy, Johannes

<120> Modularly Constructed RNA Molecules Having Two Sequence Region Types

<130> 012627-019

<140> US 09/720,215
<141> 2000-12-22

<150> PCT/DE99/01867
<151> 1999-06-25

<150> DE 198 28 624.4
<151> 1998-06-26

<160> 8

<170> PatentIn version 3.0

<210> 1
<211> 8422
<212> DNA
<213> Human

<400> 1
cttagagttt cgtggcttca gggtgggagt agttggagca ttggggatgt ttttcttacc 60

gacaagcaca gtcaggttga agacctaacc agggccagaa gtagctttgc acttttctaa 120

actaggctcc ttcaacaagg cttgctgcag atactactga ccagacaagc tgttgaccag 180

gcacctcccc tcccgcccaa acctttcccc catgtggtcg ttagagacag agcgacagag 240

cagttgagag gacactcccg ttttcggtgc catcagtgcc ccgtctacag ctcccccagc 300

tccccccacc tcccccactc ccaaccacgt tgggacaggg aggtgtgagg caggagagac 360

agttggattc tttagagaag atggatatga ccagtggcta            gatcccaccc 420

gtggtggctc aagtctggcc ccacaccagc cccaatccaa aactggcaag gacgcttcac 480

aggacaggaa agtggcacct gtctgctcca gctctggcat ggctaggagg ggggagtccc 540

ttgaactact gggtgtagac tggcctgaac cacaggagag gatggcccag ggtgaggtgg 600

catggtccat tctcaaggga cgtcctccaa cgggtggcgc tagaggccat ggaggcagta 660

ggacaaggtg caggcaggct ggcctggggt caggccgggc agagcacagc ggggtgagag 720

ggattcctaa tcactcagag cagtctgtga cttagtggac aggggagggg gcaaaggggg 780

aggagaagaa aatgttcttc cagttacttt ccaattctcc tttagggaca gcttagaatt 840
atttgcacta ttgagtcttc atgttcccac ttcaaaacaa acagatgctc tgagagcaaa 900

ctggcttgaa ttggtgacat ttagtccctc aagccaccag atgtgacagt gttgagaact 960 

acctggattt gtatatatac ctgcgcttgt tttaaagtgg gctcagcaca tagggttccc 1020 

acgaagctcc gaaactctaa gtgtttgctg caattttata aggacttcct gattggtttc 1080 

tcttctcccc ttccatttct gccttttgtt catttcatcc tttcacttct ttcccttcct 1140 

ccgtcctcct ccttcctagt tcatcccttc tcttccaggc agccgcggtg cccaaccaca 1200 

cttgtcggct ccagtcccca gaactctgcc tgccctttgt cctcctgctg ccagtaccag 1260 

ccccaccctg ttttgagccc tgaggaggcc ttgggctctg ctgagtccaa cctggcctgt 1320 

ctgtgaagag caagagagca gcaaggtctt gctctcctag gtagccccct cttccctggt 1380 

aagaaaaagc aaaaggcatt tcccaccctg aacaacgagc cttttcaccc ttctactcta 1440 

gagaagtgga ctggaggagc tgggcccgat ttggtagttg aggaaagcac agaggcctcc 1500 

tgtggcctgc cagtcatcga gtggcccaac aggggctcca tgccagccga ccttgacctc 1560 

actcagaagt ccagagtcta gcgtagtgca gcagggcagt agcggtacca atgcagaact 1620 

cccaagaccc gagctgggac cagtacctgg gtccccagcc cttcctctgc tccccctttt 1680 

ccctcggagt tcttcttgaa tggcaatgtt ttgcttttgc tcgatgcaga cagggggcca 1740 

gaacaccaca catttcactg tctgtctggt ccatagctgt ggtgtagggg cttagaggca 1800 

tgggcttgct gtgggttttt aattgatcag ttttcatgtg ggatcccatc tttttaacct 1860 

ctgttcagga agtccttatc tagctgcata tcttcatcat attggtatat ccttttctgt 1920 

gtttacagag atgtctctta tatctaaatc tgtccaactg agaagtacct tatcaaagta 1980 

gcaaatgaga cagcagtctt atgcttccag aaacacccac aggcatgtcc catgtgagct 2040 

gctgccatga actgtcaagt gtgtgttgtc ttgtgtattt cagttattgt ccctggcttc 2100 

cttactatgg tgtaatcatg aaggagtgaa acatcataga aactgtctag cacttccttg 2160 

ccagtcttta gtgatcagga accatagttg acagttccaa tcagtagctt aagaaaaaac 2220 

cgtgtttgtc tcttctggaa tggttagaag tgagggagtt tgccccgttc tgtttgtaga 2280 

gtctcatagt tggactttct agcatatatg tgtccatttc cttatgctgt aaaagcaagt 2340

cctgcaacca aactcccatc agcccaatcc ctgatccctg atcccttcca cctgctctgc 2400 

tgatgacccc cccagcttca cttctgactc ttccccagga agggaagggg ggtcagaaga 2460

gagggtgagt cctccagaac tcttcctcca aggacagaag gctcctgccc ccatagtggc 2520 

ctcgaactcc tggcactacc aaaggacact tatccacgag agcgcagcat ccgaccaggt 2580 

tgtcactgag aagatgttta ttttggtcag ttgggttttt atgtattata cttagtcaaa 2640

tgtaatgtgg cttctggaat cattgtccag agctgcttcc ccgtcacctg ggcgtcatct 2700

ggtcctggta agaggagtgc gtggcccacc aggcccccct gtcacccatg acagttcatt 2760

cagggccgat ggggcagtcg tggttgggaa cacagcattt caagcgtcac tttatttcat 2820

tcgggcccca cctgcagctc cctcaaagag gcagttgccc agcctctttc ccttccagtt 2880

tattccagag ctgccagtgg ggcctgaggc tccttagggt tttctctcta tttccccctt 2940

tcttcctcat tccctcgtct ttcccaaagg catcacgagt cagtcgcctt tcagcaggca 3000

gccttggcgg tttatcgccc tggcaggcag gggccctgca gctctcatgc tgcccctgcc 3060

ttggggtcag gttgacagga ggttggaggg aaagccttaa gctgcaggat tctcaccagc 3120

tgtgtccggc ccagttttgg ggtctgacct caatttcaat tttgtctgta cttgaacatt 3180

atgaagatgg gggcctcttt cagtgaattt gtgaacagca gaattgaccg acagctttcc 3240

agtacccatg gggctaggtc attaaggcca catccacagt ctcccccacc cttgttccag 3300

ttgttagtta ctacctcctc tcctgacaat actgtatgtc gtcgagctcc ccccaggtct 3360

acccctcccg gccctgcctg ctggtgggct tgtcatagcc agtgggattg ccggtcttga 3420

cagctcagtg agctggagat acttggtcac agccaggcgc tagcacagct cccttctgtt 3480

gatgctgtat tcccatatca aaaggcacag gggacaccca gaaacgccac atcccccaat 3540

ccatcagtgc caaactagcc aacggcccca gcttctcagc tcgctggatg gcggaagctg 3600

ctactcgtga gcgccagtgc gggtgcagac aatcttctgt tgggtggcat cattccaggc 3660

ccgaagcatg aacagtgcac ctgggacagg gagcagcccc aaattgtcac ctgcttctct 3720

gcccagcttt tcattgctgt gacagtgatg gcgaaagagg gtaataacca gacacaaact 3780

gccaagttgg gtggagaaag gagtttcttt agctgacaga atctctgaat tttaaatcac 3840

ttagtaagcg gctcaagccc aggagggagc agagggatac gagcggagtc ccctgcgcgg 3900

gaccatctgg aattggttta gcccaagtgg agcctgacag ccagaactct gtgtcccccg 3960

tctaaccaca gctccttttc cagagcattc cagtcaggct ctctgggctg actgggccag 4020

gggaggttac aggtaccagt tctttaagaa gatctttggg catatacatt tttagcctgt 4080

gtcattgccc caaatggatt cctgtttcaa gttcacacct gcagattcta ggacctgtgt 4140

cctagacttc agggagtcag ctgtttctag agttcctacc atggagtggg tctggaggac 4200

ctgcccggtg ggggggcaga gccctgctcc ctccgggtct tcctactctt ctctctgctc 4260

tgacgggatt tgttgattct ctccattttg gtgtctttct cttttagata ttgtatcaat 4320
ctttagaaaa ggcatagtct acttgttata aatcgttagg atactgcctc ccccagggtc 4380

taaaattaca tattagaggg gaaaagctga acactgaagt cagttctcaa caatttagaa 4440

ggaaaaccta gaaaacattt ggcagaaaat tacatttcga tgtttttgaa tgaatacaag 4500

caagctttta caacagtgct gatctaaaaa tacttagcac ttggcctgag atgcctggtg 4560

agcattacag gcaaggggaa tctggaggta gccgacctga ggacatggct tctgaacctg 4620

tcttttggga gtggtatgga aggtggagcg ttcaccagtg acctggaagg cccagcacca 4680

ccctccttcc cactcttctc atcttgacag agcctgcccc agcgctgacg tgtcaggaaa 4740

acacccaggg aactaggaag gcacttctgc ctgaggggca gcctgccttg cccactcctg 4800

ctctgctcgc ctcggatcag ctgagccttc tgagctggcc tctcactgcc tccccaaggc 4860

cccctgcctg ccctgtcagg aggcagaagg aagcaggtgt gagggcagtg caaggaggga 4920

gcacaacccc cagctcccgc tccgggctcc gacttgtgca caggcagagc ccagaccctg 4980

gaggaaatcc tacctttgaa ttcaagaaca tttggggaat ttggaaatct ctttgccccc 5040

aaacccccat tctgtcctac ctttaatcag gtcctgctca gcagtgagag cagatgaggt 5100

gaaaaggcca agaggtttgg ctcctgccca ctgatagccc ctctccccgc agtgtttgtg 5160

tgtcaagtgg caaagctgtt cttcctggtg accctgatta tatccagtaa cacatagact 5220

gtgcgcatag gcctgctttg tctcctctat cctgggcttt tgttttgctt tttagttttg 5280

cttttagttt ttctgtccct tttatttaac gcaccgacta gacacacaaa gcagttgaat 5340

ttttatatat atatctgtat attgcacaat tataaactca ttttgcttgt ggctccacac 5400

acacaaaaaa agacctgtta aaattatacc tgttgcttaa ttacaatatt tctgataacc 5460

atagcatagg acaagggaaa ataaaaaaag aaaaaaaaga aaaaaaaacg acaaatctgt 5520

ctgctggtca cttcttctgt ccaagcagat tcgtggtctt ttcctcgctt ctttcaaggg 5580

ctttcctgtg ccaggtgaag gaggctccag gcagcaccca ggttttgcac tcttgtttct 5640

cccgtgcttg tgaaagaggt cccaaggttc tgggtgcagg agcgctccct tgacctgctg 5700

aagtccggaa cgtagtcggc acagcctggt cgccttccac ctctgggagc tggagtccac 5760

tggggtggcc tgactccccc agtccccttc ccgtgacctg gtcagggtga gcccatgtgg 5820

agtcagcctc gcaggcctcc ctgccagtag ggtccgagtg tgtttcatcc ttcccactct 5880

gtcgagcctg ggggctggag cggagacggg aggcctggcc tgtctcggaa cctgtgagct 5940

gcaccaggta gaacgccagg gaccccagaa tcatgtgcgt cagtccaagg ggtcccctcc 6000

aggagtagtg aagactccag aaatgtccct ttcttctccc ccatcctacg agtaattgca 6060
tttgcttttg taattcttaa tgagcaatat ctgctagaga gtttagctgt aacagttctt 6120

tttgatcatc tttttttaat aattagaaac accaaaaaaa tccagaaact tgttcttcca 6180 

aagcagagag cattataatc accagggcca aaagcttccc tccctgctgt cattgcttct 6240 

tctgaggcct gaatccaaaa gaaaaacagc cataggccct ttcagtggcc gggctacccg 6300 

tgagcccttc ggaggaccag ggctggggca gcctctgggc ccacatccgg ggccagctcc 6360 

ggcgtgtgtt cagtgttagc agtgggtcat gatgctcttt cccacccagc ctgggatagg 6420 

ggcagaggag gcgaggaggc cgttgccgct gatgtttggc cgtgaacagg tgggtgtctg 6480 

cgtgcgtcca cgtgcgtgtt ttctgactga catgaaatcg acgcccgagt tagcctcacc 6540 

cggtgacctc tagccctgcc cggatggagc ggggcccacc cggttcagtg tttctgggga 6600 

gctggacagt ggagtgcaaa aggcttgcag aacttgaagc ctgctccttc ccttgctacc 6660 

acggcctcct ttccgtttga tttgtcactg cttcaatcaa taacagccgc tccagagtca 6720 

gtagtcaatg aatatatgac caaatatcac caggactgtt actcaatgtg tgccgagccc 6780 

ttgcccatgc tgggctcccg tgtatctgga cactgtaacg tgtgctgtgt ttgctcccct 6840 

tccccttcct tctttgccct ttacttgtct ttctggggtt tttctgtttg ggtttggttt 6900 

ggtttttatt tctccttttg tgttccaaac atgaggttct ctctactggt cctcttaact 6960 

gtggtgttga ggcttatatt tgtgtaattt ttggtgggtg aaaggaattt tgctaagtaa 7020 

atctcttctg tgtttgaact gaagtctgta ttgtaactat gtttaaagta attgttccag 7080 

agacaaatat ttctagacac tttttcttta caaacaaaag cattcggagg gagggggatg 7140 

gtgactgaga tgagagggga gagctgaaca gatgacccct gcccagatca gccagaagcc 7200 

acccaaagca gtggagccca ggagtcccac tccaagccag caagccgaat agctgatgtg 7260 

ttgccacttt ccaagtcact gcaaaaccag gttttgttcc gcccagtgga ttcttgtttt 7320 

gcttcccctc cccccgagat tattaccacc atcccgtgct tttaaggaaa ggcaagattg 7380 

atgtttcctt gaggggagcc aggaggggat gtgtgtgtgc agagctgaag agctggggag 7440 

aatggggctg ggcccaccca agcaggaggc tgggacgctc tgctgtgggc acaggtcagg 7500 

ctaatgttgg cagatgcagc tcttcctgga caggccaggt ggtgggcatt ctctctccaa 7560 

ggtgtgcccc gtgggcatta ctgtttaaga cacttccgtc acatcccacc ccatcctcca 7620 

gggctcaaca ctgtgacatc tctattcccc accctcccct tcccagggca ataaaatgac 7680 

catggagggg gcttgcactc tcttggctgt cacccgatcg ccagcaaaac ttagatgtga 7740 

gaaaacccct tcccattcca tggcgaaaac atctccttag aaaagccatt accctcatta 7800 
ggcatggttt tgggctccca aaacacctga cagcccctcc ctcctctgag aggcggagag 7860

tgctgactgt agtgaccatt gcatgccggg tgcagcatct ggaagagcta ggcagggtgt 7920

ctgccccctc ctgagttgaa gtcatgctcc cctgtgccag cccagaggcc gagagctatg 7980

gacagcattg ccagtaacac aggccaccct gtgcagaagg gagctggctc cagcctggaa 8040

acctgtctga ggttgggaga ggtgcacttg gggcacaggg agaggccggg acacacttag 8100

ctggagatgt ctctaaaagc cctgtatcgt attcaccttc agtttttgtg ttttgggaca 8160

attactttag aaaataagta ggtcgtttta aaaacaaaaa ttattgattg cttttttgta 8220

gtgttcagaa aaaaggttct ttgtgtatag ccaaatgact gaaagcactg atatatttaa 8280

aaacaaaagg caatttatta aggaaatttg taccatttca gtaaacctgt ctgaatgtac 8340

ctgtatacgt ttcaaaaaca cccccccccc actgaatccc tgtaacctat ttattatata 8400

aagagtttgc cttataaatt ta 8422



<210> 2 

<211> 8464 

<212> DNA 

<213> Murine 

<400> 2 



cttagagttt cgtggcttcg gggtgggagt agttggagca ttgggatgtt tttcttaccg 60

acaagcacag tcaggttgaa gacctaacca gggccagaag tagctttgca cttttctaaa 120

ctaggctcct tcaacaaggc ttgctgcaga tactactgac cagacaagct gttgaccagg 180

cactcccccc aacaatatcc tccctcttcc ccccccccac ccccgccccg tgtgctcgtt 240

agggcaattg aaaggacact cccatttttg gtgccattga tgccctgtcc ataatagctt 300

ccctgacttt tacaccaccc caactcccaa tctgaaggac tgggaggtgt gatgcaggag 360

aaactatggg actcttggga gaagactatg gagttggcca gtgattaagg cccactaatt 420

ccaactgtgg tagcacagat ctggctccac atcaacccaa tccaaaactg acaaggatat 480

tttgcaaaaa aagaaagtgg cacctgtctg atccagctct gacatggcta gaggtgagtc 540

ctaaactgat ggcttataaa ctagcctgag ccacagaaga gtatggccca gagtgaagtg 600

tcatcatctg ttcacaaggc atgctcccct agaagataat gctaaagagg tgccatggag 660

gcagcaggac aaagtacagg caggctaggt ggagtcaagc caggcctagt gccacagaac 720

aagagagcag tctgactagt aattaagagg gaagaaagga aaatattctt ccaattactt 780

tccagttctc ctttagggac agcttagaat tatttgcact attgagtctt catgttccca 840
cttcaaaaca aacagatgct ctgaaagcaa actggcttga aatggtgaca ctgtcccaca 900

agccaccaga catggcagtg ttcagaacta cctgtatctg tatatacctg cgcttgtttt 960 

aaagtgggct cagcacatag gattcccaag aagctccgaa actctaagtg tttgctgcaa 1020 

ttttataagg acttcctgat tgctttctct ctcgtccttc catttcttcc ttccttccat 1080 

ttcatgcttt catttcttcc cctagcttct agttgtttct tctgttccag gcagctgcag 1140 

tgctgaacca catggttacc taacagcagt cagctgcagc cctaggattc ttcctgccct 1200 

ttaacttccc attgccagtg ccaggtatca tatttaacct tgagcaagag ctgggctctt 1260 

ttgagccctc cctaacctct gtgaagaaga acaagaaggt aggaagctct tgctcttgct 1320 

aagaaaaatg tcaaaaggct ttcagacctt aaacaatgag ccttttcacc ttttactcta 1380 

gaaaagtgga ctagaaaatc tgggtcacat tgggtagctg aaggagatac agaggcccct 1440 

atggcctgcc agagtcgttg catggcccaa caggggctcc atgcccacta cccttgaccc 1500 

tactcagaaa tctaatgtca tacttagtgt gggcagggga cctgtcagga cagatgcaga 1560 

cctaagcagg gagtgacacc agggcccttg gcccttcttc tgacaaacat acacatccca 1620 

agtctttttc tagtggaatt cttaacctct tgctcactgg ggactgggaa gcatcagcac 1680 

atcccatatt tcaaactctg ctccataagt acagtggtga attttataga cttgactttg 1740 

ctgtggggtt ttaattggtc agttttaatt tgggatccca aagttttaac ctccattcag 1800 

gaagtcctta tctagctgca tatcttcatc atattggtat atccttttct gtgtttacag 1860 

agatgtctca tatctatcga aatctgtctg agaagtacct tatcaaagta gcaaatgaga 1920 

cagcagtctt atgcttccag aaacacccac aggcacgtcc catgtgagct gctgccatga 1980 

actgtcgagt gtgtattgtc ttgtgtattt tcgttaacgt tccccagctt ccttcctgcg 2040 

gtgtaatcat ggaagagtga aacatcatag aaatcgtcta gcacttcctg gccagtcctt 2100 

agtgatcagg aaccgtagtt gacagttcca attgatagct taagataaaa ccatgtttgt 2160 

ctcttatgga atggttagaa ctaagtgaga gatcttgccc cattctgttt gccgaatcat 2220 

agttggactt ttagtgtatt tgtatccatt tccttgtgct ataaaagcaa accctgcaac 2280 

cagctttctg tcaggcagtc cttttgcctg ctctgctttt gatcctctta gtcttgcttc 2340 

tggttcctcc ctggagaggg aggaggggtc agaagaggaa ttctggagga tccaggatat 2400 

gtccttctga actcctgctt cttccagtga caaaaggccc ctactgcccc accccaacct 2460 

gccccatgca ctcctctagg acacctttcc atacttttca caacacctag ccaggttgac 2520 

accaagttgt ttattgtggt ctgcttggaa ttttacctgt taggcttact tagtccaatc 2580 
aaatggactc caagttgggt atccctcatc tttggaagac aacctaggct gattagatat 2640

ttacttttgg gattgcagca ctttgggtgc cgtttttctt ttacttgggt tttatctgca 2700

gctccctcac caccaccacc accccccact tacctgtatg tagaactgat ttcaaaactg 2760

caggtggtgg taactgcagc ttcttagggt tttcttcact tcttgcttct ttccccattc 2820

cctcatccac aaataagggc atcacaagtc agtctccttt aagcaggcag ctttggtggg 2880

gtttttcccc tggaagccag ggaccctgtc aggctgcctc tgccttgtgg tcaggttgac 2940

aggaggttgg agggaaaagc cttaagtcat gggattctca ccagctgtgt ctggctcaga 3000

cctggaatgt gacctttatt ttgttgtatt tgaacattgt aaagtgtggg tggtacctta 3060

aactgaatat gtgaagaatc cagaaactga ccaacagctt tcagatacct ggggctaggt 3120

cactaaggtc acatccagtc ttccctaccc tgttctagtt gttagctact acctctccca 3180

gatagattgc tgtatatcct ccaactatga tcatcctggc ccaagcttgc ctgttcttga 3240

gtctgtctta accagtggaa ctgctgccct tggtgtgcag tgagttgagg actcttggtc 3300

acagccaggc tctagtagta cagctccttt ctgctggtgc tgtatttcca tatcaaaagg 3360

cacaggggag atctagaaat gccatctccc ccagtccatc agtgccaaac aagcccatga 3420

tcccagcatg ggtacagaca actctgttca gtgctatcac aacagactag aggccatgaa 3480

cattggacgt gggaaccaga gcaacccgaa ttgctgctgc tttattcagc tttccgttgc 3540

tctgacaatg ataaaacaag gcagtaactt aaaacagact gccaggtttg gcagagaaag 3600

gaaattcctt agctgacagc acctctggat tttaaatagg ttgtaataag tggctcaaac 3660

ccatccagga aaaagcaaaa gggttagaac tgaccagatg agaccagcct gatttcatgc 3720

agcccaaatg gagtccagct gtctgaactc tgcagcactt ctctactaca gtctcctaga 3780

gcattccagc caggctcttc aggctgagga gacatcacag gtgccagttc ttcaagaaga 3840

cttttgtgca tcagttcata gcctatatct ttgcccaaga ttgtagattc aggttaacac 3900

tacagattct agggcagatg actgagactc agaaaaaaag cccctgtgga ctgtggtata 3960

gcgaagtaca aaaactgaag ggggctaggg cagatgccgc atgcctcatg ccagagccaa 4020

gccctctgct ccatccacat ccttttctgg ctccttcttc ctgctctctg cttcagtgaa 4080

ccagccccac tctgaagaga tttgttgatt ctctccattt ttatgtcttt ctcttttagg 4140

tactatatag aaaaggctta gtctaattgt tataaattgc tagaatactg cctcccccag 4200

ggtctaaaaa tatatgctaa aggggaaaac ttgaacactg aaaccagttc tgaacaattt 4260

agaaggaaaa ccttgaaaac atttaacaaa aaattatatt ttaatgttta tgaataagag 4320
gaggcttttg aaaaaatgtt gatctataaa tacttacttt aggcctgagg tgtctaatga 4380 

gtgaactgag caatgggaac tcaaggctga agcctcctgc atcagaggag gtagaaccag 4440 

gagcctcttg agatttgagg tgttttagca ttggaaagcc actctttggg tagctggccc 4500 

cagaaactac ttctgacctt gtcatttgga atggaggtta gtggtctgcc agatgccaaa 4560 

gctgcatgag accagctctt ggtttatcaa tttgaacact cagtaaccta gaaggcccag 4620 

cacaaagtgt ctgctctctt cttaactgag cctgccccag cactactgca caaattaggg 4680 

agggtctact tcctacagag catccctccc tgggccccct cccatccttt gtactctacc 4740 

tacctgacct tcaggatctt ggcacatacg aaatggctgt gtagcaagca ctttggcatg 4800 

ccctcctaaa cttaccccag agcctctccc tgcctcctta agccagtctg cctgtcttct 4860 

ggggaggtgt tagagcccat agaatggaga ggagaaagaa aagaggaaga ggcaggcagg 4920 

tagtaaaaag gctctgggag gaaagacagc ctcctaggct ttgcacaagc aggactcagc 4980 

cccttgtggg aactaagtgc catcttggag tttaagaaca tttggacaag ttgcaaatga 5040 

cctttgctcc ttgctcctct caccttttat ggggccctgc ttagcactga aagcaaatgc 5100 

gctgaaaagg caaagaggtt tggctcctgc ccactgatag tcctttccct gcagtgtttg 5160 

tgtgtcaagt ggcaaagctg ttcttcctgg tgactctgat tagatccagt aacttaagag 5220 

atttgtatgc ataggtctgc tttgactctt ctattctggg cttttgattt gtttttcagt 5280 

tttgctttta gttttcctat ttttatttta tgcaccaact agacacacaa agcagttgaa 5340 

tttatatata tatatatata tatatatctg tatatttcac aattataaac tcattttgct 5400 

tgtgacgcca cacacacaca aaaagaaaaa ccttttaaaa ttatacctgt tgcttaatta 5460 

caatatttct gataaccata gagtaggaca agggaaaaaa tttaaaaaaa aaaaaaaaaa 5520 

aagaaaaaac acatctgtct gctggtcact tcttcaatcc aagcagatct gtgatctttc 5580 

ctcgcgtctt tcaaagactt ccctgtgcta agtgaaggaa gctccaggct gcacccaggt 5640 

tttgtgcttt gtttctcctc tgttgtgaaa ggggccccaa gattctgggt acaggacagt 5700 

tcatttcagc atggggtcag gagacaagag cactcccttt acatgctgac gtacagaact 5760 

tagtgggaat agcctagtcc ccacctctag ggatggggag ctagcatgca tgggggtgac 5820 

ccaactccct ccacctttcc ctggccagga agagcctgtg tacagtaagt ctgacaagct 5880 

ttccccagtt agcagggctc agagcattta aaaaccctcc aaactttgct gagtctaggg 5940 

actagagaga agatagaaga tttggtctat ctccaaggtg tgtaagctgt accaggtaga 6000 

atgccaggga ccccagaacc acatccaaca gcccaatggg tctcctccag aaagtagtga 6060 
agactccaga aacatccctt tctcttctcc ctgctcccat gagtaactgc atttgctttt 6120 

gtaatcctta atgagcatta tctgctaaaa aaaaaaaatt agctgtaaca gttctttttg 6180 

caaaaggatc attcttaaat aattaaaaac accccccccc caaaaaaaag tccagaacct 6240 

tgttcttcca aagcagagag cattataatc agggccaaaa tctgtcccac acctctaccc 6300 

catctcctca tgattgctgc ttctaaggcc agaatacagc aaagatattt gtaggccctt 6360 

tgggtgactg ggctaccctt ggagctcttg gaagatgggc tggggaagcc tctgagaccc 6420 

tatcctaggg ccttgctcta gggagtaatc agtattagta gagtgtcaca acattattcc 6480 

ccagccggca tgagatgggg gcagaagaag ccaaagggtt gtctccactg ctacttactt 6540 

ggccactgac aggtaggtga ccatgtatgt ccatatgcat gttttatggc tgatgtgaga 6600 

tcagcaccca agttagcttc acctggtgac ctctaaccct gcctggatgg agcaggccac 6660 

ctggttcaat gtttctgggc agctggacaa tggagtgcaa aaggcttaca gaacttgaag 6720 

ccttttcctt actttgctag cacggcctcc ttttccattt gatttgtcac tgcttcagtc 6780 

aataacagcc gctccagagt cagtagttga tgaatatatg accaaatatc accaggactg 6840 

ttactcaacg tgtgccgagc cctttccttg tgctgggctc cctgtgtacc tggacactgt 6900 

aatgtgtgct gtgtttgctc tccttcctct tccttccttg ccctttcctt gtctttctgg 6960 

ggtttttctg ttgggtttgg tttggtttta tttttccttt tgtgttccaa acatgaggtt 7020 

ttctctactg gtcctcttta actgtggtgt tgaggcttct atttgtgtaa tttttggtgg 7080 

gtgaaaggaa ctttgctaag taaatctctt ctgtgtttga aatgaagtct gtattgtaac 7140 

tatgtttaaa gtaattgttc cagagacaaa tgcttctagg tacattttca ttacaaacaa 7200 

agcatttgaa gggagggaag tggtgaataa gacaagaggg gcaatctgaa ttgatccctg 7260 

cccagatcag ccagaagcta ccaaaagtta agcactggtt ttccattcca agtcaagaga 7320 

ctgaagctga tgttttgcca ttttcaaagt caaagcaaaa ccagcttttc cacccaatgg 7380 

attctttgct tctccttccc agattattac tactgctgta ataatctagg agtgccagga 7440 

gggaaaggag tattaacaca gagctgtgct cactgagtat ggaaaggctt ggtctgagtt 7500 

ttcaggagga tgacccactg tggacatggg gagaagacag aagataaatt agccgctccc 7560 

tgcctaagat acctcttaat agataagtca aggccatgga cattattgtc tacaaggcat 7620 

gtttcaaaga catgaccagt caggacactt ctgtcatact ccatgttgcc ccctagtaca 7680 

cagtactaat ctgatatctc tgttcccgcc atgcctgggg gataaaatga tagcagagac 7740 

tcctttcctt caatgtgatc taattcccaa caaaatctgg gcctgagata ccacctgttt 7800 
ctatggcaaa catcctcagt aaagtgttat tctcattgca gattgttcca gcctaatgta 7860 

agaggaacag agcagtgttc ccttggagcc tcatgtggac agttctacct gtagtgacca 7920 

gttggctata gtagttatta gctggaacaa ccagacaggg tacatgcccc ctccaaaatc 7980 

catgttgtac tcccctctgc cagccagggg gggtgagatc tgtagaatag tgcagccagt 8040 

gacaagccac cttgtgtttg tcaccagctc aaaaactcat ctaaggttgg gagcaggcag 8100 

acaaggcaga gagaaagatc caggacagac ctagctgggc tggaggggtc ttgaaaagcc 8160 

ctctgtcgta ttcaccttca gtttttgtgc tttgggacaa ttactttaga aaataagtag 8220 

gtcgttttaa aaacaaaata ttgattgctt ttttgtagtg ttcaaaacaa aaggttcttt 8280 

gtgtatagcc aaatgactga aagcactgat atatttaaaa acaaaaggca atttattaag 8340 

gaaatttgta ccatttcagt aaacctgtct gaatgtacct gtatacgttt caaaaacaca 8400 

ccccactgaa cccctgtaac ctatttatta tataaagagt ttgccttata aatttacata 8460 

aaaa 8464 


<210> 3 
<211> 803 
<212> DNA 
<213> Hamster 

<400> 3 

ttgctgcaga tactactgac cagacaagct gttgaccagg caccccccca atactccccc 60 

aatgtgctca ttagagatag cagttgagag gacactccca tttttggtgc cctgtccata 120 

gcttccctga ctcttccacc accccaactc ccaatctgag ggaccgggag gtgcgaggca 180 

ggaaaaatat tggattcttt agagaagact agaggtgacc agtgactgtg gcccagtaat 240 

tagaactgtg gtggcacaag tctggcccca catccaccca atccaaaact gataaggata 300 

ttttgaaaaa caggaaagca gtacctgtct gatccagctc tggtataggt aggagtgagt 360 

cctgaactgc tggattacag actggcttga gccacagaag atgatggacc agagtaaagt 420 

atcatcacct gctcacaagg catgcttcac tagagaataa ttctaaagag gtgccatgga 480 

ggcagcagga caaggcacaa gcagtctggg tgggggtcaa gccagaccta gtgccacaga 540 

acaagagagc aatctgtgac tagtagttag ggactttgtg gatgggacaa ggggcatggg 600 

ggaagaaatg aaaatattct tccaattact ttccagttct cctttaggga cagcttagaa 660 

ttatttgcac tattgagtct tcatgttccc acttaaaaac aaacagatgc tctgaaagca 720 

aactggcttg aaatggtgac actttgtccc acaagccacc aaatgtggca gtgtttagaa 780 
<210> 4 
<211> 790 
<212> DNA 
<213> Kangaroo 

<400> 4 

ttgctgcata tactactgac cagacaagct gtttatcagg ctttttaggg tacaccagca 60 

cctgccctcc attcatccct gttgggagag ggatggtgta ctggttgtca ctagagacct 120 

aacagagtag ggttagtggg agcttacatt ttcagtgcca ttaacattct agtccaaggt 180 

cttaaattat tatgttgagg ggtttttttt cccctgaggg ggccgggggg tggggggagg 240 

gttgattaga ttccttagga aagagggttg agacagacag cagagcactg agcagttggc 300 

actaaaggag accttgacta ggggccaggt ggcatcatct aatcccaagg ggctccaagt 360 

gagtattagg gtgggggaag acattataga aggaatagaa acaggatagc tcagcctaaa 420 

gaagagcggt taaaacccta cccaccagga gttgacttga aagaggcccc tatggaggaa 480 

tccccaacca ccaaaagcaa tcttgagctg cagctgcttc atttagtgga ccttgtgtat 540 

atctgggtgt gtatgcacat agatagacag tgagaaagaa aactgttctt ccagttcttt 600 

tccagtgcta ctagcttagg gacaggttag aactgtctgc acaattgtgt gatcattccc 660 

attcccactt caaaacaaac tgactgagat gttcaacaga aaactggctt caatgggtaa 720 

catgcccttg ccacttactt aagacactgg tgtgatgggg ttttgaactc cctatatttg 780 

taggtatctg 790 


<210> 5 
<211> 841 
<212> DNA 
<213> Macaca 

<400> 5 

ttgctgcaga tactactgac cagacaagct gttgaccagg cacctcccct cccgcccaaa 60 

cctttccccc atgtggtcgt tagagacaga cgagttgaga ggacactccc gttttcggtg 120 

ccatcagtgc cccgtctacc actcccccag ctcccccact ctcccccact cccaaccacg 180 

ttgggacagg gaggtgtgag gcaggagaga cagttggatt ctttagagat ggatgtgacc 240 

agtggctatg gcccgtgcga tcccacccgt ggcggctcaa atctggcccc accccagccc 300 

caatccaaaa ctggcaagga cgcttcacag gacaggaaag tggcacctgt ctgttccggc 360 

atggctagga gggagttgtc ccttgaacta ctgggtgtag actggcctaa atcacaggag 420 

aggatggccc agggtgaggt ggcatggtcc attctcaagg gacgtcctcc agttggtggc 480 

actagagagg ccatggaggc agtaggacaa ggcacaggca ggctggccca gggtcaggcc 540 

gggccgaaca cagcggggtg agagggattc ctcgtctcag agcagtctgt gaccggtagt 600 

tagggactta gtggacaggg aaggggcaaa gggggaggag aagaaaatgt tcttccagtt 660 

actttccaat tctactcctt tagggacagc ttagaattat ttgcactatt gagtcttcat 720 

gttcccactt caaaacaaac agatgctctg agagcaaact ggcttgaatt ggtgacgttt 780 

agtccctcag gccaccagat gtgatggtgt tgagaactac ctggatatgt atatatacct 840 

g 841 


<210> 6 
<211> 846 
<212> DNA 
<213> Orangutan 

<400> 6 

ttgctgcaga tactactgac cagacaagct gttgaccagg cacctcccct cccgcccaaa 60 

cctttccccc atgtggtcgt tagagacaga gcagttgaga ggacactccc gttttcggtg 120 

ccatcagtgc cccgtctgca gctcccccag ctccccccac ctcccccact cccaaccacg 180 

ttgggacagg gaggtgtgag gcaggagaga cagttggatt ctttcgagaa gatggatatg 240 

accagtggcc atggcctgtg cgatcccacc cgtggcggct caagtctggc cccacaccag 300 

ccccaatcca aaactggcaa ggacgcttca caggacagga aagtggcacc tgtctgctcc 360 

agctctggca tggctaggag ggagtcgtcc cttgaactac tgggtgtaga ctggcctgaa 420 

ccacaggaga ggatggccca gggtgaggtg gcatggtcca ttctcaaggg acgtcctcca 480 

acgggtggcg ctagaaaggc catggaggca gtaggacaag gcgcaggcag gctggcccgg 540 

ggtcaggccg ggcagggcac agcggggtga gagggattcc taatcactca gagcagtgtg 600 

tgactggtag ttagggactc agtggacagg ggaggggcga gggggcagga gaagaaaatg 660 

ttcttccagt tactttccaa ttctccttta gggacagctt agaattattt gcactattga 720 

gtcttcatgt tcccacttca aaacaaacga tgctctgaga gcaaactggc ttgaattggt 780 

gacatttagt ccctcaagcc accagatgtg agtgttgaga actacctgga tttgtatata 840 
<210> 7 
<211> 813 
<212> DNA 
<213> Rat 

<400> 7 

ttgctgcaga tactactgac cagacaagct gttgaccagg cactccccac aacaacaacc 60 

ccctccctcc tcaccccacc cctatcccct gtgtgctcat tagagagggc aattgagagg 120 

acactcccat ttttggtgcc actgatgccc tgtccatagc ttccctgact tttacaccac 180 

cccaactccc aatctgaggg actgggaggt gtgacgcagg agaaactata taggactctt 240 

gggagaagac tatagagttg gcaagtgatt gcgccccagt aattccaact gtggtagcac 300 

aagtctggct ccacaccaac ccaatccaaa actgacaagg acattttgca aaaaatgaaa 360 

gtggcatttg tctgatccag ctctggcatg gctagagatg agtcttaaac tgttggctta 420 

taaactggcc tgagcaacag aagaggatgg cccagagtaa agtgtcatca tctgttcaca 480 

aggcatgctc ccctagaagt tcatgctaaa gaagtgccat ggaggcagca ggacaaagta 540 

caggctaggt ggagtcaagc caggcctagt gccacagagc aagagagcag tctctgacta 600 

gtagttaagg gggaagaaag aaaaatattc ttccaattgc tttccagttc tcctttaggg 660 

acagcttaga attatttgca ctattgagtc ttcatgttcc cacttcaaaa caaatagatg 720 

ctctgaaagc aaactggctt gaaatggtga cactgtccca caagccacca gacaatggca 780 

gtgttcagaa ctacctgtat atgtatatac ctg 813 


<210> 8 
<211> 842 
<212> DNA 
<213> Chimpanzee 

<400> 8 

ttgctgcaga tactactgac cagacaagct gttgaccagg cacctcccct cccgcccaaa 60 

cctttccccc atgtggtcgt tagagacaga gcgacagagc agttgagagg acactcccgt 120 

tttcggtgcc atcagtgccc cgtctacagc tcccccagct ccccccacct cccccactcc 180 

caaccacgtt gggacaggga ggtgtgaggc aggagagaca gttggattct ttagagaaga 240 

tggatatgac cagtggctat ggcctgtgtg atcccacccg tggtggctca agtctggccc 300 

cacaccagcc ccaatccaaa actggcaagg acgcttcaca ggacaggaaa gtggcacctg 360 

tctgctccag ctctggcatg gctaggaggg gggagtccct tgaactactg ggtgtagact 420 

ggcctgaacc acaggagagg atggcccagg gtgaggtggc gtggtccatt ctcaagggac 480 

gtcctccaac gggtggcgct agaggccatg gaggcagtag gacaaggcgc aggcaggctg 540 

gcccggggtc aggccgggca gagcacagcg gggtgagagg gattcctaat cactcagagc 600 

agtctgtgac ttagtggaca ggggaggggg caaaggggga ggagaagaaa atgttcttcc 660 

agttactttc caattctcct ttagggacag cttagaatta tttgcactat tgagtcttca 720 

tgttcccact tcaaaacaaa cagatgctct gagagcaaac tggcttgaat tggtgacatt 780 

tagtccctca agccaccaga tgtgacagtg ttgagaacta cctggatttg tatatatacc 840 

tg 842 
SEQUENCE PROTOCOL 



(1) GENERAL INDICATIONS: 



(i) APPLICANT: 

(A) NAME: Deutsches Krebsforschungszentrum 

(B) STREET: Im Neuenheimer Feld 280 

(C) TOWN: Heidelberg 

(E) COUNTRY: Germany 

(F) POSTAL CODE: 69120 

(ii) TITLE OF THE INVENTION: Modularly Constructed RNA 

Molecules Having Two Sequence Region Types 

(iii) NUMBER OF SEQUENCES: 8 

(iv) COMPUTER-READABLE VERSION: 

(A) DATA CARRIER: floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, version #1.30 

(EPO) 

(v) DATA OF THE CURRENT APPLICATION: not yet known 

(vi) DATA OF THE PRIOR APPLICATION: 
APPLICATION NUMBER: DE 198 28 624.4 
FILING DATE: June 26, 1998 

(2) INDICATIONS AS TO ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8422 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CTTAGAGTTT CGTGGCTTCA GGGTGGGAGT AGTTGGAGCA TTGGGGATGT TTTTCTTACC 60 

GACAAGCACA GTCAGGTTGA AGACCTAACC AGGGCCAGAA GTAGCTTTGC ACTTTTCTAA 120 

ACTAGGCTCC TTCAACAAGG CTTGCTGCAG ATACTACTGA CCAGACAAGC TGTTGACCAG 180 

GCACCTCCCC TCCCGCCCAA ACCTTTCCCC CATGTGGTCG TTAGAGACAG AGCGACAGAG 240 

CAGTTGAGAG GACACTCCCG TTTTCGGTGC CATCAGTGCC CCGTCTACAG CTCCCCCAGC 300 

TCCCCCCACC TCCCCCACTC CCAACCACGT TGGGACAGGG AGGTGTGAGG CAGGAGAGAC 360 

AGTTGGATTC TTTAGAGAAG ATGGATATGA CCAGTGGCTA TGGCCTGTGC GATCCCACCC 420 

GTGGTGGCTC AAGTCTGGCC CCACACCAGC CCCAATCCAA AACTGGCAAG GACGCTTCAC 480 

AGGACAGGAA AGTGGCACCT GTCTGCTCCA GCTCTGGCAT GGCTAGGAGG GGGGAGTCCC 540 

TTGAACTACT GGGTGTAGAC TGGCCTGAAC CACAGGAGAG GATGGCCCAG GGTGAGGTGG 600 

CATGGTCCAT TCTCAAGGGA CGTCCTCCAA CGGGTGGCGC TAGAGGCCAT GGAGGCAGTA 660 



GGACAAGGTG CAGGCAGGCT GGCCTGGGGT CAGGCCGGGC AGAGCACAGC GGGGTGAGAG 720 

GGATTCCTAA TCACTCAGAG CAGTCTGTGA CTTAGTGGAC AGGGGAGGGG GCAAAGGGGG 780 

AGGAGAAGAA AATGTTCTTC CAGTTACTTT CCAATTCTCC TTTAGGGACA GCTTAGAATT 840 

ATTTGCACTA TTGAGTCTTC ATGTTCCCAC TTCAAAACAA ACAGATGCTC TGAGAGCAAA 900 

CTGGCTTGAA TTGGTGACAT TTAGTCCCTC AAGCCACCAG ATGTGACAGT GTTGAGAACT 960 

ACCTGGATTT GTATATATAC CTGCGCTTGT TTTAAAGTGG GCTCAGCACA TAGGGTTCCC 1020 

ACGAAGCTCC GAAACTCTAA GTGTTTGCTG CAATTTTATA AGGACTTCCT GATTGGTTTC 1080 

TCTTCTCCCC TTCCATTTCT GCCTTTTGTT CATTTCATCC TTTCACTTCT TTCCCTTCCT 1140 

CCGTCCTCCT CCTTCCTAGT TCATCCCTTC TCTTCCAGGC AGCCGCGGTG CCCAACCACA 1200 

CTTGTCGGCT CCAGTCCCCA GAACTCTGCC TGCCCTTTGT CCTCCTGCTG CCAGTACCAG 1260 

CCCCACCCTG TTTTGAGCCC TGAGGAGGCC TTGGGCTCTG CTGAGTCCAA CCTGGCCTGT 1320 

CTGTGAAGAG CAAGAGAGCA GCAAGGTCTT GCTCTCCTAG GTAGCCCCCT CTTCCCTGGT 1380 

AAGAAAAAGC AAAAGGCATT TCCCACCCTG AACAACGAGC CTTTTCACCC TTCTACTCTA 1440 

GAGAAGTGGA CTGGAGGAGC TGGGCCCGAT TTGGTAGTTG AGGAAAGCAC AGAGGCCTCC 1500 

TGTGGCCTGC CAGTCATCGA GTGGCCCAAC AGGGGCTCCA TGCCAGCCGA CCTTGACCTC 1560 

ACTCAGAAGT CCAGAGTCTA GCGTAGTGCA GCAGGGCAGT AGCGGTACCA ATGCAGAACT 1620 

CCCAAGACCC GAGCTGGGAC CAGTACCTGG GTCCCCAGCC CTTCCTCTGC TCCCCCTTTT 1680 

CCCTCGGAGT TCTTCTTGAA TGGCAATGTT TTGCTTTTGC TCGATGCAGA CAGGGGGCCA 1740 

GAACACCACA CATTTCACTG TCTGTCTGGT CCATAGCTGT GGTGTAGGGG CTTAGAGGCA 1800 

TGGGCTTGCT GTGGGTTTTT AATTGATCAG TTTTCATGTG GGATCCCATC TTTTTAACCT 1860 

CTGTTCAGGA AGTCCTTATC TAGCTGCATA TCTTCATCAT ATTGGTATAT CCTTTTCTGT 1920 

GTTTACAGAG ATGTCTCTTA TATCTAAATC TGTCCAACTG AGAAGTACCT TATCAAAGTA 1980 

GCAAATGAGA CAGCAGTCTT ATGCTTCCAG AAACACCCAC AGGCATGTCC CATGTGAGCT 2040 

GCTGCCATGA ACTGTCAAGT GTGTGTTGTC TTGTGTATTT CAGTTATTGT CCCTGGCTTC 2100 

CTTACTATGG TGTAATCATG AAGGAGTGAA ACATCATAGA AACTGTCTAG CACTTCCTTG 2160 

CCAGTCTTTA GTGATCAGGA ACCATAGTTG ACAGTTCCAA TCAGTAGCTT AAGAAAAAAC 2220 

CGTGTTTGTC TCTTCTGGAA TGGTTAGAAG TGAGGGAGTT TGCCCCGTTC TGTTTGTAGA 2280 

GTCTCATAGT TGGACTTTCT AGCATATATG TGTCCATTTC CTTATGCTGT AAAAGCAAGT 2340 

CCTGCAACCA AACTCCCATC AGCCCAATCC CTGATCCCTG ATCCCTTCCA CCTGCTCTGC 2400 

TGATGACCCC CCCAGCTTCA CTTCTGACTC TTCCCCAGGA AGGGAAGGGG GGTCAGAAGA 2460 

GAGGGTGAGT CCTCCAGAAC TCTTCCTCCA AGGACAGAAG GCTCCTGCCC CCATAGTGGC 2520 

CTCGAACTCC TGGCACTACC AAAGGACACT TATCCACGAG AGCGCAGCAT CCGACCAGGT 2580 

TGTCACTGAG AAGATGTTTA TTTTGGTCAG TTGGGTTTTT ATGTATTATA CTTAGTCAAA 2640 

TGTAATGTGG CTTCTGGAAT CATTGTCCAG AGCTGCTTCC CCGTCACCTG GGCGTCATCT 2700 
2700 
GGTCCTGGTA AGAGGAGTGC GTGGCCCACC AGGCCCCCCT GTCACCCATG ACAGTTCATT 2760 

CAGGGCCGAT GGGGCAGTCG TGGTTGGGAA CACAGCATTT CAAGCGTCAC TTTATTTCAT 2820 

TCGGGCCCCA CCTGCAGCTC CCTCAAAGAG GCAGTTGCCC AGCCTCTTTC CCTTCCAGTT 2880 

TATTCCAGAG CTGCCAGTGG GGCCTGAGGC TCCTTAGGGT TTTCTCTCTA TTTCCCCCTT 2940 

TCTTCCTCAT TCCCTCGTCT TTCCCAAAGG CATCACGAGT CAGTCGCCTT TCAGCAGGCA 3000 

GCCTTGGCGG TTTATCGCCC TGGCAGGCAG GGGCCCTGCA GCTCTCATGC TGCCCCTGCC 3060 

TTGGGGTCAG GTTGACAGGA GGTTGGAGGG AAAGCCTTAA GCTGCAGGAT TCTCACCAGC 3120 

TGTGTCCGGC CCAGTTTTGG GGTCTGACCT CAATTTCAAT TTTGTCTGTA CTTGAACATT 3180 

ATGAAGATGG GGGCCTCTTT CAGTGAATTT GTGAACAGCA GAATTGACCG ACAGCTTTCC 3240 

AGTACCCATG GGGCTAGGTC ATTAAGGCCA CATCCACAGT CTCCCCCACC CTTGTTCCAG 3300 

TTGTTAGTTA CTACCTCCTC TCCTGACAAT ACTGTATGTC GTCGAGCTCC CCCCAGGTCT 3360 

ACCCCTCCCG GCCCTGCCTG CTGGTGGGCT TGTCATAGCC AGTGGGATTG CCGGTCTTGA 3420 

CAGCTCAGTG AGCTGGAGAT ACTTGGTCAC AGCCAGGCGC TAGCACAGCT CCCTTCTGTT 3480 

GATGCTGTAT TCCCATATCA AAAGGCACAG GGGACACCCA GAAACGCCAC ATCCCCCAAT 3540 

CCATCAGTGC CAAACTAGCC AACGGCCCCA GCTTCTCAGC TCGCTGGATG GCGGAAGCTG 3600 

CTACTCGTGA GCGCCAGTGC GGGTGCAGAC AATCTTCTGT TGGGTGGCAT CATTCCAGGC 3660 

CCGAAGCATG AACAGTGCAC CTGGGACAGG GAGCAGCCCC AAATTGTCAC CTGCTTCTCT 3720 

GCCCAGCTTT TCATTGCTGT GACAGTGATG GCGAAAGAGG GTAATAACCA GACACAAACT 3780 

GCCAAGTTGG GTGGAGAAAG GAGTTTCTTT AGCTGACAGA ATCTCTGAAT TTTAAATCAC 3840 

TTAGTAAGCG GCTCAAGCCC AGGAGGGAGC AGAGGGATAC GAGCGGAGTC CCCTGCGCGG 3900 

GACCATCTGG AATTGGTTTA GCCCAAGTGG AGCCTGACAG CCAGAACTCT GTGTCCCCCG 3960 

TCTAACCACA GCTCCTTTTC CAGAGCATTC CAGTCAGGCT CTCTGGGCTG ACTGGGCCAG 4020 

GGGAGGTTAC AGGTACCAGT TCTTTAAGAA GATCTTTGGG CATATACATT TTTAGCCTGT 4080 

GTCATTGCCC CAAATGGATT CCTGTTTCAA GTTCACACCT GCAGATTCTA GGACCTGTGT 4140 

CCTAGACTTC AGGGAGTCAG CTGTTTCTAG AGTTCCTACC ATGGAGTGGG TCTGGAGGAC 4200 

CTGCCCGGTG GGGGGGCAGA GCCCTGCTCC CTCCGGGTCT TCCTACTCTT CTCTCTGCTC 4260 

TGACGGGATT TGTTGATTCT CTCCATTTTG GTGTCTTTCT CTTTTAGATA TTGTATCAAT 4320 

CTTTAGAAAA GGCATAGTCT ACTTGTTATA AATCGTTAGG ATACTGCCTC CCCCAGGGTC 4380 

TAAAATTACA TATTAGAGGG GAAAAGCTGA ACACTGAAGT CAGTTCTCAA CAATTTAGAA 4440 

GGAAAACCTA GAAAACATTT GGCAGAAAAT TACATTTCGA TGTTTTTGAA TGAATACAAG 4500 

CAAGCTTTTA CAACAGTGCT GATCTAAAAA TACTTAGCAC TTGGCCTGAG ATGCCTGGTG 4560 

AGCATTACAG GCAAGGGGAA TCTGGAGGTA GCCGACCTGA GGACATGGCT TCTGAACCTG 4620 

TCTTTTGGGA GTGGTATGGA AGGTGGAGCG TTCACCAGTG ACCTGGAAGG CCCAGCACCA 4680 

CCCTCCTTCC CACTCTTCTC ATCTTGACAG AGCCTGCCCC AGCGCTGACG TGTCAGGAAA 4740 
ACACCCAGGG AACTAGGAAG GCACTTCTGC CTGAGGGGCA GCCTGCCTTG CCCACTCCTG 4800 

CTCTGCTCGC CTCGGATCAG CTGAGCCTTC TGAGCTGGCC TCTCACTGCC TCCCCAAGGC 4860 

CCCCTGCCTG CCCTGTCAGG AGGCAGAAGG AAGCAGGTGT GAGGGCAGTG CAAGGAGGGA 4920 

GCACAACCCC CAGCTCCCGC TCCGGGCTCC GACTTGTGCA CAGGCAGAGC CCAGACCCTG 4980 

GAGGAAATCC TACCTTTGAA TTCAAGAACA TTTGGGGAAT TTGGAAATCT CTTTGCCCCC 5040 

AAACCCCCAT TCTGTCCTAC CTTTAATCAG GTCCTGCTCA GCAGTGAGAG CAGATGAGGT 5100 

GAAAAGGCCA AGAGGTTTGG CTCCTGCCCA CTGATAGCCC CTCTCCCCGC AGTGTTTGTG 5160 

TGTCAAGTGG CAAAGCTGTT CTTCCTGGTG ACCCTGATTA TATCCAGTAA CACATAGACT 5220 

GTGCGCATAG GCCTGCTTTG TCTCCTCTAT CCTGGGCTTT TGTTTTGCTT TTTAGTTTTG 5280 

CTTTTAGTTT TTCTGTCCCT TTTATTTAAC GCACCGACTA GACACACAAA GCAGTTGAAT 5340 

TTTTATATAT ATATCTGTAT ATTGCACAAT TATAAACTCA TTTTGCTTGT GGCTCCACAC 5400 

ACACAAAAAA AGACCTGTTA AAATTATACC TGTTGCTTAA TTACAATATT TCTGATAACC 5460 

ATAGCATAGG ACAAGGGAAA ATAAAAAAAG AAAAAAAAGA AAAAAAAACG ACAAATCTGT 5520 

CTGCTGGTCA CTTCTTCTGT CCAAGCAGAT TCGTGGTCTT TTCCTCGCTT CTTTCAAGGG 5580 

CTTTCCTGTG CCAGGTGAAG GAGGCTCCAG GCAGCACCCA GGTTTTGCAC TCTTGTTTCT 5640 

CCCGTGCTTG TGAAAGAGGT CCCAAGGTTC TGGGTGCAGG AGCGCTCCCT TGACCTGCTG 5700 

AAGTCCGGAA CGTAGTCGGC ACAGCCTGGT CGCCTTCCAC CTCTGGGAGC TGGAGTCCAC 5760 

TGGGGTGGCC TGACTCCCCC AGTCCCCTTC CCGTGACCTG GTCAGGGTGA GCCCATGTGG 5820 

AGTCAGCCTC GCAGGCCTCC CTGCCAGTAG GGTCCGAGTG TGTTTCATCC TTCCCACTCT 5880 

GTCGAGCCTG GGGGCTGGAG CGGAGACGGG AGGCCTGGCC TGTCTCGGAA CCTGTGAGCT 5940 

GCACCAGGTA GAACGCCAGG GACCCCAGAA TCATGTGCGT CAGTCCAAGG GGTCCCCTCC 6000 

AGGAGTAGTG AAGACTCCAG AAATGTCCCT TTCTTCTCCC CCATCCTACG AGTAATTGCA 6060 

TTTGCTTTTG TAATTCTTAA TGAGCAATAT CTGCTAGAGA GTTTAGCTGT AACAGTTCTT 6120 

TTTGATCATC TTTTTTTAAT AATTAGAAAC ACCAAAAAAA TCCAGAAACT TGTTCTTCCA 6180 

AAGCAGAGAG CATTATAATC ACCAGGGCCA AAAGCTTCCC TCCCTGCTGT CATTGCTTCT 6240 

TCTGAGGCCT GAATCCAAAA GAAAAACAGC CATAGGCCCT TTCAGTGGCC GGGCTACCCG 6300 

TGAGCCCTTC GGAGGACCAG GGCTGGGGCA GCCTCTGGGC CCACATCCGG GGCCAGCTCC 6360 

GGCGTGTGTT CAGTGTTAGC AGTGGGTCAT GATGCTCTTT CCCACCCAGC CTGGGATAGG 6420 

GGCAGAGGAG GCGAGGAGGC CGTTGCCGCT GATGTTTGGC CGTGAACAGG TGGGTGTCTG 6480 

CGTGCGTCCA CGTGCGTGTT TTCTGACTGA CATGAAATCG ACGCCCGAGT TAGCCTCACC 6540 

CGGTGACCTC TAGCCCTGCC CGGATGGAGC GGGGCCCACC CGGTTCAGTG TTTCTGGGGA 6600 

GCTGGACAGT GGAGTGCAAA AGGCTTGCAG AACTTGAAGC CTGCTCCTTC CCTTGCTACC 6660 

ACGGCCTCCT TTCCGTTTGA TTTGTCACTG CTTCAATCAA TAACAGCCGC TCCAGAGTCA 6720 

GTAGTCAATG AATATATGAC CAAATATCAC CAGGACTGTT ACTCAATGTG TGCCGAGCCC 6780 
TTGCCCATGC TGGGCTCCCG TGTATCTGGA CACTGTAACG TGTGCTGTGT TTGCTCCCCT 6840 

TCCCCTTCCT TCTTTGCCCT TTACTTGTCT TTCTGGGGTT TTTCTGTTTG GGTTTGGTTT 6900 

GGTTTTTATT TCTCCTTTTG TGTTCCAAAC ATGAGGTTCT CTCTACTGGT CCTCTTAACT 6960 

GTGGTGTTGA GGCTTATATT TGTGTAATTT TTGGTGGGTG AAAGGAATTT TGCTAAGTAA 7020 

ATCTCTTCTG TGTTTGAACT GAAGTCTGTA TTGTAACTAT GTTTAAAGTA ATTGTTCCAG 7080 

AGACAAATAT TTCTAGACAC TTTTTCTTTA CAAACAAAAG CATTCGGAGG GAGGGGGATG 7140 

GTGACTGAGA TGAGAGGGGA GAGCTGAACA GATGACCCCT GCCCAGATCA GCCAGAAGCC 7200 

ACCCAAAGCA GTGGAGCCCA GGAGTCCCAC TCCAAGCCAG CAAGCCGAAT AGCTGATGTG 7260 

TTGCCACTTT CCAAGTCACT GCAAAACCAG GTTTTGTTCC GCCCAGTGGA TTCTTGTTTT 7320 

GCTTCCCCTC CCCCCGAGAT TATTACCACC ATCCCGTGCT TTTAAGGAAA GGCAAGATTG 7380 

ATGTTTCCTT GAGGGGAGCC AGGAGGGGAT GTGTGTGTGC AGAGCTGAAG AGCTGGGGAG 7440 

AATGGGGCTG GGCCCACCCA AGCAGGAGGC TGGGACGCTC TGCTGTGGGC ACAGGTCAGG 7500 

CTAATGTTGG CAGATGCAGC TCTTCCTGGA CAGGCCAGGT GGTGGGCATT CTCTCTCCAA 7560 

GGTGTGCCCC GTGGGCATTA CTGTTTAAGA CACTTCCGTC ACATCCCACC CCATCCTCCA 7620 

GGGCTCAACA CTGTGACATC TCTATTCCCC ACCCTCCCCT TCCCAGGGCA ATAAAATGAC 7680 

CATGGAGGGG GCTTGCACTC TCTTGGCTGT CACCCGATCG CCAGCAAAAC TTAGATGTGA 7740 

GAAAACCCCT TCCCATTCCA TGGCGAAAAC ATCTCCTTAG AAAAGCCATT ACCCTCATTA 7800 

GGCATGGTTT TGGGCTCCCA AAACACCTGA CAGCCCCTCC CTCCTCTGAG AGGCGGAGAG 7860 

TGCTGACTGT AGTGACCATT GCATGCCGGG TGCAGCATCT GGAAGAGCTA GGCAGGGTGT 7920 

CTGCCCCCTC CTGAGTTGAA GTCATGCTCC CCTGTGCCAG CCCAGAGGCC GAGAGCTATG 7980 

GACAGCATTG CCAGTAACAC AGGCCACCCT GTGCAGAAGG GAGCTGGCTC CAGCCTGGAA 8040 

ACCTGTCTGA GGTTGGGAGA GGTGCACTTG GGGCACAGGG AGAGGCCGGG ACACACTTAG 8100 

CTGGAGATGT CTCTAAAAGC CCTGTATCGT ATTCACCTTC AGTTTTTGTG TTTTGGGACA 8160 

ATTACTTTAG AAAATAAGTA GGTCGTTTTA AAAACAAAAA TTATTGATTG CTTTTTTGTA 8220 

GTGTTCAGAA AAAAGGTTCT TTGTGTATAG CCAAATGACT GAAAGCACTG ATATATTTAA 8280 

AAACAAAAGG CAATTTATTA AGGAAATTTG TACCATTTCA GTAAACCTGT CTGAATGTAC 8340 

CTGTATACGT TTCAAAAACA CCCCCCCCCC ACTGAATCCC TGTAACCTAT TTATTATATA 8400 

AAGAGTTTGC CTTATAAATT TA 8422 


(2) INDICATIONS AS TO ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8464 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



CTTAGAGTTT CGTGGCTTCG GGGTGGGAGT AGTTGGAGCA TTGGGATGTT TTTCTTACCG 60 

ACAAGCACAG TCAGGTTGAA GACCTAACCA GGGCCAGAAG TAGCTTTGCA CTTTTCTAAA 120 

CTAGGCTCCT TCAACAAGGC TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG 180 

CACTCCCCCC AACAATATCC TCCCTCTTCC CCCCCCCCAC CCCCGCCCCG TGTGCTCGTT 240 

AGGGCAATTG AAAGGACACT CCCATTTTTG GTGCCATTGA TGCCCTGTCC ATAATAGCTT 300 

CCCTGACTTT TACACCACCC CAACTCCCAA TCTGAAGGAC TGGGAGGTGT GATGCAGGAG 360 

AAACTATGGG ACTCTTGGGA GAAGACTATG GAGTTGGCCA GTGATTAAGG CCCACTAATT 420 

CCAACTGTGG TAGCACAGAT CTGGCTCCAC ATCAACCCAA TCCAAAACTG ACAAGGATAT 480 

TTTGCAAAAA AAGAAAGTGG CACCTGTCTG ATCCAGCTCT GACATGGCTA GAGGTGAGTC 540 

CTAAACTGAT GGCTTATAAA CTAGCCTGAG CCACAGAAGA GTATGGCCCA GAGTGAAGTG 600 

TCATCATCTG TTCACAAGGC ATGCTCCCCT AGAAGATAAT GCTAAAGAGG TGCCATGGAG 660 

GCAGCAGGAC AAAGTACAGG CAGGCTAGGT GGAGTCAAGC CAGGCCTAGT GCCACAGAAC 720 

AAGAGAGCAG TCTGACTAGT AATTAAGAGG GAAGAAAGGA AAATATTCTT CCAATTACTT 780 

TCCAGTTCTC CTTTAGGGAC AGCTTAGAAT TATTTGCACT ATTGAGTCTT CATGTTCCCA 840 

CTTCAAAACA AACAGATGCT CTGAAAGCAA ACTGGCTTGA AATGGTGACA CTGTCCCACA 900 

AGCCACCAGA CATGGCAGTG TTCAGAACTA CCTGTATCTG TATATACCTG CGCTTGTTTT 960 

AAAGTGGGCT CAGCACATAG GATTCCCAAG AAGCTCCGAA ACTCTAAGTG TTTGCTGCAA 1020 

TTTTATAAGG ACTTCCTGAT TGCTTTCTCT CTCGTCCTTC CATTTCTTCC TTCCTTCCAT 1080 

TTCATGCTTT CATTTCTTCC CCTAGCTTCT AGTTGTTTCT TCTGTTCCAG GCAGCTGCAG 1140 

TGCTGAACCA CATGGTTACC TAACAGCAGT CAGCTGCAGC CCTAGGATTC TTCCTGCCCT 1200 

TTAACTTCCC ATTGCCAGTG CCAGGTATCA TATTTAACCT TGAGCAAGAG CTGGGCTCTT 1260 

TTGAGCCCTC CCTAACCTCT GTGAAGAAGA ACAAGAAGGT AGGAAGCTCT TGCTCTTGCT 1320 

AAGAAAAATG TCAAAAGGCT TTCAGACCTT AAACAATGAG CCTTTTCACC TTTTACTCTA 1380 

GAAAAGTGGA CTAGAAAATC TGGGTCACAT TGGGTAGCTG AAGGAGATAC AGAGGCCCCT 1440 

ATGGCCTGCC AGAGTCGTTG CATGGCCCAA CAGGGGCTCC ATGCCCACTA CCCTTGACCC 1500 

TACTCAGAAA TCTAATGTCA TACTTAGTGT GGGCAGGGGA CCTGTCAGGA CAGATGCAGA 1560 

CCTAAGCAGG GAGTGACACC AGGGCCCTTG GCCCTTCTTC TGACAAACAT ACACATCCCA 1620 

AGTCTTTTTC TAGTGGAATT CTTAACCTCT TGCTCACTGG GGACTGGGAA GCATCAGCAC 1680 

ATCCCATATT TCAAACTCTG CTCCATAAGT ACAGTGGTGA ATTTTATAGA CTTGACTTTG 1740 

CTGTGGGGTT TTAATTGGTC AGTTTTAATT TGGGATCCCA AAGTTTTAAC CTCCATTCAG 1800 

GAAGTCCTTA TCTAGCTGCA TATCTTCATC ATATTGGTAT ATCCTTTTCT GTGTTTACAG 1860 
AGATGTCTCA TATCTATCGA AATCTGTCTG AGAAGTACCT TATCAAAGTA GCAAATGAGA 1920

CAGCAGTCTT ATGCTTCCAG AAACACCCAC AGGCACGTCC CATGTGAGCT GCTGCCATGA 1980 

ACTGTCGAGT GTGTATTGTC TTGTGTATTT TCGTTAACGT TCCCCAGCTT CCTTCCTGCG 2040 

GTGTAATCAT GGAAGAGTGA AACATCATAG AAATCGTCTA GCACTTCCTG GCCAGTCCTT 2100 

AGTGATCAGG AACCGTAGTT GACAGTTCCA ATTGATAGCT TAAGATAAAA CCATGTTTGT 2160 

CTCTTATGGA ATGGTTAGAA CTAAGTGAGA GATCTTGCCC CATTCTGTTT GCCGAATCAT 2220 

AGTTGGACTT TTAGTGTATT TGTATCCATT TCCTTGTGCT ATAAAAGCAA ACCCTGCAAC 2280 

CAGCTTTCTG TCAGGCAGTC CTTTTGCCTG CTCTGCTTTT GATCCTCTTA GTCTTGCTTC 2340 

TGGTTCCTCC CTGGAGAGGG AGGAGGGGTC AGAAGAGGAA TTCTGGAGGA TCCAGGATAT 2400 

GTCCTTCTGA ACTCCTGCTT CTTCCAGTGA CAAAAGGCCC CTACTGCCCC ACCCCAACCT 2460 

GCCCCATGCA CTCCTCTAGG ACACCTTTCC ATACTTTTCA CAACACCTAG CCAGGTTGAC 2520 

ACCAAGTTGT TTATTGTGGT CTGCTTGGAA TTTTACCTGT TAGGCTTACT TAGTCCAATC 2580 

AAATGGACTC CAAGTTGGGT ATCCCTCATC TTTGGAAGAC AACCTAGGCT GATTAGATAT 2640 

TTACTTTTGG GATTGCAGCA CTTTGGGTGC CGTTTTTCTT TTACTTGGGT TTTATCTGCA 2700

GCTCCCTCAC CACCACCACC ACCCCCCACT TACCTGTATG TAGAACTGAT TTCAAAACTG 2760 

CAGGTGGTGG TAACTGCAGC TTCTTAGGGT TTTCTTCACT TCTTGCTTCT TTCCCCATTC 2820 

CCTCATCCAC AAATAAGGGC ATCACAAGTC AGTCTCCTTT AAGCAGGCAG CTTTGGTGGG 2880 

GTTTTTCCCC TGGAAGCCAG GGACCCTGTC AGGCTGCCTC TGCCTTGTGG TCAGGTTGAC 2940 

AGGAGGTTGG AGGGAAAAGC CTTAAGTCAT GGGATTCTCA CCAGCTGTGT CTGGCTCAGA 3000 

CCTGGAATGT GACCTTTATT TTGTTGTATT TGAACATTGT AAAGTGTGGG TGGTACCTTA 3060 

AACTGAATAT GTGAAGAATC CAGAAACTGA CCAACAGCTT TCAGATACCT GGGGCTAGGT 3120 

CACTAAGGTC ACATCCAGTC TTCCCTACCC TGTTCTAGTT GTTAGCTACT ACCTCTCCCA 3180

GATAGATTGC TGTATATCCT CCAACTATGA TCATCCTGGC CCAAGCTTGC CTGTTCTTGA 3240 

GTCTGTCTTA ACCAGTGGAA CTGCTGCCCT TGGTGTGCAG TGAGTTGAGG ACTCTTGGTC 3300 

ACAGCCAGGC TCTAGTAGTA CAGCTCCTTT CTGCTGGTGC TGTATTTCCA TATCAAAAGG 3360 

CACAGGGGAG ATCTAGAAAT GCCATCTCCC CCAGTCCATC AGTGCCAAAC AAGCCCATGA 3420 

TCCCAGCATG GGTACAGACA ACTCTGTTCA GTGCTATCAC AACAGACTAG AGGCCATGAA 3480 

CATTGGACGT GGGAACCAGA GCAACCCGAA TTGCTGCTGC TTTATTCAGC TTTCCGTTGC 3540 

TCTGACAATG ATAAAACAAG GCAGTAACTT AAAACAGACT GCCAGGTTTG GCAGAGAAAG 3600 

GAAATTCCTT AGCTGACAGC ACCTCTGGAT TTTAAATAGG TTGTAATAAG TGGCTCAAAC 3660 

CCATCCAGGA AAAAGCAAAA GGGTTAGAAC TGACCAGATG AGACCAGCCT GATTTCATGC 3720 

AGCCCAAATG GAGTCCAGCT GTCTGAACTC TGCAGCACTT CTCTACTACA GTCTCCTAGA 3780 

GGATTCCAGC CAGGCTCTTC AGGCTGAGGA GACATCACAG GTGCCAGTTC TTCAAGAAGA 3840 

CTTTTGTGCA TCAGTTCATA GCCTATATCT TTGCCCAAGA TTGTAGATTC AGGTTAACAC 3900 
TACAGATTCT AGGGCAGATG ACTGAGACTC AGAAAAAAAG CCCCTGTGGA CTGTGGTATA 3960

GCGAAGTACA AAAACTGAAG GGGGCTAGGG CAGATGCCGC ATGCCTCATG CCAGAGCCAA 4020

GCCCTCTGCT CCATCCACAT CCTTTTCTGG CTCCTTCTTC CTGCTCTCTG CTTCAGTGAA 4080

CCAGCCCCAC TCTGAAGAGA TTTGTTGATT CTCTCCATTT TTATGTCTTT CTCTTTTAGG 4140

TACTATATAG AAAAGGCTTA GTCTAATTGT TATAAATTGC TAGAATACTG CCTCCCCCAG 4200

GGTCTAAAAA TATATGCTAA AGGGGAAAAC TTGAACACTG AAACCAGTTC TGAACAATTT 4260

AGAAGGAAAA CCTTGAAAAC ATTTAACAAA AAATTATATT TTAATGTTTA TGAATAAGAG 4320

GAGGCTTTTG AAAAAATGTT GATCTATAAA TACTTACTTT AGGCCTGAGG TGTCTAATGA 4380

GTGAACTGAG CAATGGGAAC TCAAGGCTGA AGCCTCCTGC ATCAGAGGAG GTAGAACCAG 4440

GAGCCTCTTG AGATTTGAGG TGTTTTAGCA TTGGAAAGCC ACTCTTTGGG TAGCTGGCCC 4500

CAGAAACTAC TTCTGACCTT GTCATTTGGA ATGGAGGTTA GTGGTCTGCC AGATGCCAAA 4560

GCTGCATGAG ACCAGCTCTT GGTTTATCAA TTTGAACACT CAGTAACCTA GAAGGCCCAG 4620

CACAAAGTGT CTGCTCTCTT CTTAACTGAG CCTGCCCCAG CACTACTGCA CAAATTAGGG 4680

AGGGTCTACT TCCTACAGAG CATCCCTCCC TGGGCCCCCT CCCATCCTTT GTACTCTACC 4740

TACCTGACCT TCAGGATCTT GGCACATACG AAATGGCTGT GTAGCAAGCA CTTTGGCATG 4800

CCCTCCTAAA CTTACCCCAG AGCCTCTCCC TGCCTCCTTA AGCCAGTCTG CCTGTCTTCT 4860

GGGGAGGTGT TAGAGCCCAT AGAATGGAGA GGAGAAAGAA AAGAGGAAGA GGCAGGCAGG 4920

TAGTAAAAAG GCTCTGGGAG GAAAGACAGC CTCCTAGGCT TTGCACAAGC AGGACTCAGC 4980

CCCTTGTGGG AACTAAGTGC CATCTTGGAG TTTAAGAACA TTTGGACAAG TTGCAAATGA 5040

CCTTTGCTCC TTGCTCCTCT CACCTTTTAT GGGGCCCTGC TTAGCACTGA AAGCAAATGC 5100

GCTGAAAAGG CAAAGAGGTT TGGCTCCTGC CCACTGATAG TCCTTTCCCT GCAGTGTTTG 5160

TGTGTCAAGT GGCAAAGCTG TTCTTCCTGG TGACTCTGAT TAGATCCAGT AACTTAAGAG 5220

ATTTGTATGC ATAGGTCTGC TTTGACTCTT CTATTCTGGG CTTTTGATTT GTTTTTCAGT 5280

TTTGCTTTTA GTTTTCCTAT TTTTATTTTA TGCACCAACT AGACACACAA AGCAGTTGAA 5340

TTTATATATA TATATATATA TATATATCTG TATATTTCAC AATTATAAAC TCATTTTGCT 5400

TGTGACGCCA CACACACACA AAAAGAAAAA CCTTTTAAAA TTATACCTGT TGCTTAATTA 5460

CAATATTTCT GATAACCATA GAGTAGGACA AGGGAAAAAA TTTAAAAAAA AAAAAAAAAA 5520

AAGAAAAAAC ACATCTGTCT GCTGGTCACT TCTTCAATCC AAGCAGATCT GTGATCTTTC 5580

CTCGCGTCTT TCAAAGACTT CCCTGTGCTA AGTGAAGGAA GCTCCAGGCT GCACCCAGGT 5640

TTTGTGCTTT GTTTCTCCTC TGTTGTGAAA GGGGCCCCAA GATTCTGGGT ACAGGACAGT 5700

TCATTTCAGC ATGGGGTCAG GAGACAAGAG CACTCCCTTT ACATGCTGAC GTACAGAACT 5760

TAGTGGGAAT AGCCTAGTCC CCACCTCTAG GGATGGGGAG CTAGCATGCA TGGGGGTGAC 5820

CCAACTCCCT CCACCTTTCC CTGGCCAGGA AGAGCCTGTG TACAGTAAGT CTGACAAGCT 5880

TTCCCCAGTT AGCAGGGCTC AGAGCATTTA AAAACCCTCC AAACTTTGCT GAGTCTAGGG 5940

ACTAGAGAGA AGATAGAAGA TTTGGTCTAT CTCCAAGGTG TGTAAGCTGT ACCAGGTAGA 6000

ATGCCAGGGA CCCCAGAACC ACATCCAACA GCCCAATGGG TCTCCTCCAG AAAGTAGTGA 6060

AGACTCCAGA AACATCCCTT TCTCTTCTCC CTGCTCCCAT GAGTAACTGC ATTTGCTTTT 6120

GTAATCCTTA ATGAGCATTA TCTGCTAAAA AAAAAAAATT AGCTGTAACA GTTCTTTTTG 6180

CAAAAGGATC ATTCTTAAAT AATTAAAAAC ACCCCCCCCC CAAAAAAAAG TCCAGAACCT 6240

TGTTCTTCCA AAGCAGAGAG CATTATAATC AGGGCCAAAA TCTGTCCCAC ACCTCTACCC 6300

CATCTCCTCA TGATTGCTGC TTCTAAGGCC AGAATACAGC AAAGATATTT GTAGGCCCTT 6360

TGGGTGACTG GGCTACCCTT GGAGCTCTTG GAAGATGGGC TGGGGAAGCC TCTGAGACCC 6420

TATCCTAGGG CCTTGCTCTA GGGAGTAATC AGTATTAGTA GAGTGTCACA ACATTATTCC 6480

CCAGCCGGCA TGAGATGGGG GCAGAAGAAG CCAAAGGGTT GTCTCCACTG CTACTTACTT 6540

GGCCACTGAC AGGTAGGTGA CCATGTATGT CCATATGCAT GTTTTATGGC TGATGTGAGA 6600

TCAGCACCCA AGTTAGCTTC ACCTGGTGAC CTCTAACCCT GCCTGGATGG AGCAGGCCAC 6660

CTGGTTCAAT GTTTCTGGGC AGCTGGACAA TGGAGTGCAA AAGGCTTACA GAACTTGAAG 6720

CCTTTTCCTT ACTTTGCTAG CACGGCCTCC TTTTCCATTT GATTTGTCAC TGCTTCAGTC 6780

AATAACAGCC GCTCCAGAGT CAGTAGTTGA TGAATATATG ACCAAATATC ACCAGGACTG 6840

TTACTCAACG TGTGCCGAGC CCTTTCCTTG TGCTGGGCTC CCTGTGTACC TGGACACTGT 6900

AATGTGTGCT GTGTTTGCTC TCCTTCCTCT TCCTTCCTTG CCCTTTCCTT GTCTTTCTGG 6960

GGTTTTTCTG TTGGGTTTGG TTTGGTTTTA TTTTTCCTTT TGTGTTCCAA ACATGAGGTT 7020

TTCTCTACTG GTCCTCTTTA ACTGTGGTGT TGAGGCTTCT ATTTGTGTAA TTTTTGGTGG 7080

GTGAAAGGAA CTTTGCTAAG TAAATCTCTT CTGTGTTTGA AATGAAGTCT GTATTGTAAC 7140

TATGTTTAAA GTAATTGTTC CAGAGACAAA TGCTTCTAGG TACATTTTCA TTACAAACAA 7200

AGCATTTGAA GGGAGGGAAG TGGTGAATAA GACAAGAGGG GCAATCTGAA TTGATCCCTG 7260

CCCAGATCAG CCAGAAGCTA CCAAAAGTTA AGCACTGGTT TTCCATTCCA AGTCAAGAGA 7320

CTGAAGCTGA TGTTTTGCCA TTTTCAAAGT CAAAGCAAAA CCAGCTTTTC CACCCAATGG 7380

ATTCTTTGCT TCTCCTTCCC AGATTATTAC TACTGCTGTA ATAATCTAGG AGTGCCAGGA 7440

GGGAAAGGAG TATTAACACA GAGCTGTGCT CACTGAGTAT GGAAAGGCTT GGTCTGAGTT 7500

TTCAGGAGGA TGACCCACTG TGGACATGGG GAGAAGACAG AAGATAAATT AGCCGCTCCC 7560

TGCCTAAGAT ACCTCTTAAT AGATAAGTCA AGGCCATGGA CATTATTGTC TACAAGGCAT 7620

GTTTCAAAGA CATGACCAGT CAGGACACTT CTGTCATACT CCATGTTGCC CCCTAGTACA 7680

CAGTACTAAT CTGATATCTC TGTTCCCGCC ATGCCTGGGG GATAAAATGA TAGCAGAGAC 7740

TCCTTTCCTT CAATGTGATC TAATTCCCAA CAAAATCTGG GCCTGAGATA CCACCTGTTT 7800

CTATGGCAAA CATCCTCAGT AAAGTGTTAT TCTCATTGCA GATTGTTCCA GCCTAATGTA 7860

AGAGGAACAG AGCAGTGTTC CCTTGGAGCC TCATGTGGAC AGTTCTACCT GTAGTGACCA 7920

GTTGGCTATA GTAGTTATTA GCTGGAACAA CCAGACAGGG TACATGCCCC CTCCAAAATC 7980

CATGTTGTAC TCCCCTCTGC CAGCCAGGGG GGGTGAGATC TGTAGAATAG TGCAGCCAGT 8040

GACAAGCCAC CTTGTGTTTG TCACCAGCTC AAAAACTCAT CTAAGGTTGG GAGCAGGCAG 8100 

ACAAGGCAGA GAGAAAGATC CAGGACAGAC CTAGCTGGGC TGGAGGGGTC TTGAAAAGCC 8160 

CTCTGTCGTA TTCACCTTCA GTTTTTGTGC TTTGGGACAA TTACTTTAGA AAATAAGTAG 8220 

GTCGTTTTAA AAACAAAATA TTGATTGCTT TTTTGTAGTG TTCAAAACAA AAGGTTCTTT 8280 

GTGTATAGCC AAATGACTGA AAGCACTGAT ATATTTAAAA ACAAAAGGCA ATTTATTAAG 8340

GAAATTTGTA CCATTTCAGT AAACCTGTCT GAATGTACCT GTATACGTTT CAAAAACACA 8400 

CCCCACTGAA CCCCTGTAAC CTATTTATTA TATAAAGAGT TTGCCTTATA AATTTACATA 8460 

AAAA 8464 



(2) INDICATIONS AS TO ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 803 base pairs

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCCCCCCA ATACTCCCCC 60

AATGTGCTCA TTAGAGATAG CAGTTGAGAG GACACTCCCA TTTTTGGTGC CCTGTCCATA 120

GCTTCCCTGA CTCTTCCACC ACCCCAACTC CCAATCTGAG GGACCGGGAG GTGCGAGGCA 180

GGAAAAATAT TGGATTCTTT AGAGAAGACT AGAGGTGACC AGTGACTGTG GCCCAGTAAT 240

TAGAACTGTG GTGGCACAAG TCTGGCCCCA CATCCACCCA ATCCAAAACT GATAAGGATA 300

TTTTGAAAAA CAGGAAAGCA GTACCTGTCT GATCCAGCTC TGGTATAGGT AGGAGTGAGT 360

CCTGAACTGC TGGATTACAG ACTGGCTTGA GCCACAGAAG ATGATGGACC AGAGTAAAGT 420

ATCATCACCT GCTCACAAGG CATGCTTCAC TAGAGAATAA TTCTAAAGAG GTGCCATGGA 480

GGCAGCAGGA CAAGGCACAA GCAGTCTGGG TGGGGGTCAA GCCAGACCTA GTGCCACAGA 540

ACAAGAGAGC AATCTGTGAC TAGTAGTTAG GGACTTTGTG GATGGGACAA GGGGCATGGG 600

GGAAGAAATG AAAATATTCT TCCAATTACT TTCCAGTTCT CCTTTAGGGA CAGCTTAGAA 660

TTATTTGCAC TATTGAGTCT TCATGTTCCC ACTTAAAAAC AAACAGATGC TCTGAAAGCA 720

AACTGGCTTG AAATGGTGAC ACTTTGTCCC ACAAGCCACC AAATGTGGCA GTGTTTAGAA 780

CTACCTGGAT CTGTATATAC CTG 803



(2) INDICATIONS AS TO ID NO: 4: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 790 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 



(ii) KIND OF MOLECULE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:






TTGCTGCATA TACTACTGAC CAGACAAGCT GTTTATCAGG CTTTTTAGGG TACACCAGCA 60

CCTGCCCTCC ATTCATCCCT GTTGGGAGAG GGATGGTGTA CTGGTTGTCA CTAGAGACCT 120

AACAGAGTAG GGTTAGTGGG AGCTTACATT TTCAGTGCCA TTAACATTCT AGTCCAAGGT 180

CTTAAATTAT TATGTTGAGG GGTTTTTTTT CCCCTGAGGG GGCCGGGGGG TGGGGGGAGG 240

GTTGATTAGA TTCCTTAGGA AAGAGGGTTG AGACAGACAG CAGAGCACTG AGCAGTTGGC 300

ACTAAAGGAG ACCTTGACTA GGGGCCAGGT GGCATCATCT AATCCCAAGG GGCTCCAAGT 360

GAGTATTAGG GTGGGGGAAG ACATTATAGA AGGAATAGAA ACAGGATAGC TCAGCCTAAA 420

GAAGAGCGGT TAAAACCCTA CCCACCAGGA GTTGACTTGA AAGAGGCCCC TATGGAGGAA 480

TCCCCAACCA CCAAAAGCAA TCTTGAGCTG CAGCTGCTTC ATTTAGTGGA CCTTGTGTAT 540

ATCTGGGTGT GTATGCACAT AGATAGACAG TGAGAAAGAA AACTGTTCTT CCAGTTCTTT 600

TCCAGTGCTA CTAGCTTAGG GACAGGTTAG AACTGTCTGC ACAATTGTGT GATCATTCCC 660

ATTCCCACTT CAAAACAAAC TGACTGAGAT GTTCAACAGA AAACTGGCTT CAATGGGTAA 720

CATGCCCTTG CCACTTACTT AAGACACTGG TGTGATGGGG TTTTGAACTC CCTATATTTG 780

TAGGTATCTG 790


(2) INDICATIONS AS TO ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:









(A) LENGTH: 841 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT CCCGCCCAAA 60 

CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCAGTTGAGA GGACACTCCC GTTTTCGGTG 120 

CCATCAGTGC CCCGTCTACC ACTCCCCCAG CTCCCCCCAC CTCCCCCACT CCCAACCACG 180 

TTGGGACAGG GAGGTGTGAG GCAGGAGAGA CAGTTGGATT CTTTAGAGAT GGATGTGACC 240 
AGTGGCTATG GCCCGTGCGA TCCCACCCGT GGCGGCTCAA ATCTGGCCCC ACCCCAGCCC 300

CAATCCAAAA CTGGCAAGGA CGCTTCACAG GACAGGAAAG TGGCACCTGT CTGTTCCGGC 360

ATGGCTAGGA GGGAGTTGTC CCTTGAACTA CTGGGTGTAG ACTGGCCTAA ATCACAGGAG 420

AGGATGGCCC AGGGTGAGGT GGCATGGTCC ATTCTCAAGG GACGTCCTCC AGTTGGTGGC 480

ACTAGAGAGG CCATGGAGGC AGTAGGACAA GGCACAGGCA GGCTGGCCCA GGGTCAGGCC 540

GGGCCGAACA CAGCGGGGTG AGAGGGATTC CTCGTCTCAG AGCAGTCTGT GACCGGTAGT 600

TAGGGACTTA GTGGACAGGG AAGGGGCAAA GGGGGAGGAG AAGAAAATGT TCTTCCAGTT 660

ACTTTCCAAT TCTACTCCTT TAGGGACAGC TTAGAATTAT TTGCACTATT GAGTCTTCAT 720

GTTCCCACTT CAAAACAAAC AGATGCTCTG AGAGCAAACT GGCTTGAATT GGTGACGTTT 780

AGTCCCTCAG GCCACCAGAT GTGATGGTGT TGAGAACTAC CTGGATATGT ATATATACCT 840



G 841 

(2) INDICATIONS AS TO ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 846 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT CCCGCCCAAA 60

CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCAGTTGAGA GGACACTCCC GTTTTCGGTG 120

CCATCAGTGC CCCGTCTGCA GCTCCCCCAG CTCCCCCCAC CTCCCCCACT CCCAACCACG 180

TTGGGACAGG GAGGTGTGAG GCAGGAGAGA CAGTTGGATT CTTTCGAGAA GATGGATATG 240

ACCAGTGGCC ATGGCCTGTG CGATCCCACC CGTGGCGGCT CAAGTCTGGC CCCACACCAG 300

CCCCAATCCA AAACTGGCAA GGACGCTTCA CAGGACAGGA AAGTGGCACC TGTCTGCTCC 360

AGCTCTGGCA TGGCTAGGAG GGAGTCGTCC CTTGAACTAC TGGGTGTAGA CTGGCCTGAA 420

CCACAGGAGA GGATGGCCCA GGGTGAGGTG GCATGGTCCA TTCTCAAGGG ACGTCCTCCA 480

ACGGGTGGCG CTAGAAAGGC CATGGAGGCA GTAGGACAAG GCGCAGGCAG GCTGGCCCGG 540

GGTCAGGCCG GGCAGGGCAC AGCGGGGTGA GAGGGATTCC TAATCACTCA GAGCAGTGTG 600

TGACTGGTAG TTAGGGACTC AGTGGACAGG GGAGGGGCGA GGGGGCAGGA GAAGAAAATG 660

TTCTTCCAGT TACTTTCCAA TTCTCCTTTA GGGACAGCTT AGAATTATTT GCACTATTGA 720

GTCTTCATGT TCCCACTTCA AAACAAACGA TGCTCTGAGA GCAAACTGGC TTGAATTGGT 780

GACATTTAGT CCCTCAAGCC ACCAGATGTG AGTGTTGAGA ACTACCTGGA TTTGTATATA 840

TACCTG 846

(2) INDICATIONS AS TO ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:






TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACTCCCCAC AACAACAACC 60

CCGTCCCTCC TCACCCCACC CCTATCCCCT GTGTGCTCAT TAGAGAGGGC AATTGAGAGG 120

ACACTCCCAT TTTTGGTGCC ACTGATGCCC TGTCCATAGC TTCCCTGACT TTTACACCAC 180

CCCAACTCCC AATCTGAGGG ACTGGGAGGT GTGACGCAGG AGAAACTATA TAGGACTCTT 240

GGGAGAAGAC TATAGAGTTG GCAAGTGATT GCGCCCCAGT AATTCCAACT GTGGTAGCAC 300

AAGTCTGGCT CCACACCAAC CCAATCCAAA ACTGACAAGG ACATTTTGCA AAAAATGAAA 360

GTGGCATTTG TCTGATCCAG CTCTGGCATG GCTAGAGATG AGTCTTAAAC TGTTGGCTTA 420

TAAACTGGCC TGAGCAACAG AAGAGGATGG CCCAGAGTAA AGTGTCATCA TCTGTTCACA 480

AGGCATGCTC CCCTAGAAGT TCATGCTAAA GAAGTGCCAT GGAGGCAGCA GGACAAAGTA 540

CAGGCTAGGT GGAGTCAAGC CAGGCCTAGT GCCACAGAGC AAGAGAGCAG TCTCTGACTA 600

GTAGTTAAGG GGGAAGAAAG AAAAATATTC TTCCAATTGC TTTCCAGTTC TCCTTTAGGG 660

ACAGCTTAGA ATTATTTGCA CTATTGAGTC TTCATGTTCC CACTTCAAAA CAAATAGATG 720

CTCTGAAAGC AAACTGGCTT GAAATGGTGA CACTGTCCCA CAAGCCACCA GACAATGGCA 780

GTGTTCAGAA CTACCTGTAT ATGTATATAC CTG 813



(2) INDICATIONS AS TO ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 842 base pairs 

(B) KIND: nucleotide 

(C) STRAND FORM: not known 

(D) TOPOLOGY: not known 

(ii) KIND OF MOLECULE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT CCCGCCCAAA 60 
CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCGACAGAGC AGTTGAGAGG ACACTCCCGT 120 
TTTCGGTGCC ATCAGTGCCC CGTCTACAGC TCCCCCAGCT CCCCCCACCT CCCCCACTCC 180

CAACCACGTT GGGACAGGGA GGTGTGAGGC AGGAGAGACA GTTGGATTCT TTAGAGAAGA 240 

TGGATATGAC CAGTGGCTAT GGCCTGTGTG ATCCCACCCG TGGTGGCTCA AGTCTGGCCC 300

CACACCAGCC CCAATCCAAA ACTGGCAAGG ACGCTTCACA GGACAGGAAA GTGGCACCTG 360 

TCTGCTCCAG CTCTGGCATG GCTAGGAGGG GGGAGTCCCT TGAACTACTG GGTGTAGACT 420 

GGCCTGAACC ACAGGAGAGG ATGGCCCAGG GTGAGGTGGC GTGGTCCATT CTCAAGGGAC 480 

GTCCTCCAAC GGGTGGCGCT AGAGGCCATG GAGGCAGTAG GACAAGGCGC AGGCAGGCTG 540 

GCCCGGGGTC AGGCCGGGCA GAGCACAGCG GGGTGAGAGG GATTCCTAAT CACTCAGAGC 600 

AGTCTGTGAC TTAGTGGACA GGGGAGGGGG CAAAGGGGGA GGAGAAGAAA ATGTTCTTCC 660 

AGTTACTTTC CAATTCTCCT TTAGGGACAG CTTAGAATTA TTTGCACTAT TGAGTCTTCA 720 

TGTTCCCACT TCAAAACAAA CAGATGCTCT GAGAGCAAAC TGGCTTGAAT TGGTGACATT 780 

TAGTCCCTCA AGCCACCAGA TGTGACAGTG TTGAGAACTA CCTGGATTTG TATATATACC 840

TG 842 



